## Meta Characters for Regular Expressions

In regular expressions, meta characters are characters with a predefined special meaning. Instead of matching themselves literally, they describe what to match, where to match it, or how many times. Here are some commonly used meta characters in regular expressions:

In [3]:
# let's import the "re" package, which provides regular expression operations
import re

* . (dot): Matches any character except a newline. For example, the pattern c.t would match "cat", "cut", "cot", and so on.

In [4]:
# let's take some strings
s1 = "cat"
s2 = "cut"
s3 = "cot"
s4 = "pot"

# let's try to match any single character between "c" and "t" using the dot meta character
print(re.search('c.t', s1))
print(re.search('c.t', s2))
print(re.search('c.t', s3))
print(re.search('c.t', s4))

<re.Match object; span=(0, 3), match='cat'>
<re.Match object; span=(0, 3), match='cut'>
<re.Match object; span=(0, 3), match='cot'>
None
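
The one exception named above, the newline, can be checked directly. The following is a minimal sketch; the sample string is made up for illustration, and `re.DOTALL` is the standard `re` flag that lets the dot match a newline as well.

```python
import re

# hypothetical sample: a newline sits between "c" and "t"
s = "c\nt"

# by default the dot does not match the newline, so there is no match
default_match = re.search('c.t', s)

# with re.DOTALL the dot is allowed to match the newline too
dotall_match = re.search('c.t', s, flags=re.DOTALL)

print(default_match)             # None
print(dotall_match is not None)  # True
```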


* ^ (caret): Matches the start of a string. For example, the pattern ^Hello would match a string that starts with "Hello".

In [6]:
# let's take some strings
s1 = 'mumbai'
s2 = 'kolkata'
s3 = 'chennai'
s4 = 'delhi'

# let's use the caret meta character to match the strings that start with the letter "m"
print(re.search('^m', s1))
print(re.search('^m', s2))
print(re.search('^m', s3))
print(re.search('^m', s4))

<re.Match object; span=(0, 1), match='m'>
None
None
None
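
By default the caret anchors only at the very start of the whole string. As a small sketch (the two-line sample string is an assumption made for illustration), the standard `re.MULTILINE` flag makes the caret also match at the start of each line:

```python
import re

# hypothetical sample: two city names on separate lines
cities = "delhi\nmumbai"

# without a flag, ^ only anchors at the start of the whole string
start_only = re.search('^m', cities)

# with re.MULTILINE, ^ also anchors right after each newline
per_line = re.search('^m', cities, flags=re.MULTILINE)

print(start_only)       # None
print(per_line.span())  # (6, 7)
```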


* \$ (dollar sign): Matches the end of a string. For example, the pattern world$ would match a string that ends with "world".

In [7]:
# let's take some strings
s1 = 'mumbai'
s2 = 'kolkata'
s3 = 'chennai'
s4 = 'delhi'

# let's use the dollar meta character to match the strings that end with the letter "a"
print(re.search('a$', s1))
print(re.search('a$', s2))
print(re.search('a$', s3))
print(re.search('a$', s4))

None
<re.Match object; span=(6, 7), match='a'>
None
None
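
The caret and the dollar sign are often combined to require that the whole string matches a pattern. The sketch below uses made-up sample strings; it also shows that in Python's `re` module the dollar sign matches just before a single trailing newline as well.

```python
import re

# the whole string is exactly "delhi"
exact = re.search('^delhi$', 'delhi')

# extra text before "delhi", so the ^ anchor fails
partial = re.search('^delhi$', 'new delhi')

# a newline follows the final "a"; $ still matches just before it
trailing = re.search('a$', 'kolkata\n')

print(exact is not None)  # True
print(partial)            # None
print(trailing.span())    # (6, 7)
```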


* \* (asterisk): Matches zero or more occurrences of the preceding character. For example, the pattern ab*c would match "ac", "abc", "abbc", and so on.

In [8]:
# let's take some strings
s1 = "ac"
s2 = "abc"
s3 = "abbc"
s4 = "cab"

# let's use the asterisk meta character to match zero or more occurrences of the preceding character
print(re.search('ab*c', s1))
print(re.search('ab*c', s2))
print(re.search('ab*c', s3))
print(re.search('ab*c', s4))

<re.Match object; span=(0, 2), match='ac'>
<re.Match object; span=(0, 3), match='abc'>
<re.Match object; span=(0, 4), match='abbc'>
None
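
Because "zero or more" includes zero, a pattern built only from an asterisk can succeed with an empty match. The sketch below illustrates this with made-up sample strings, and uses `re.findall` to list every match in a string.

```python
import re

# "b*" is satisfied by zero b's, so it matches the empty string at position 0
empty = re.search('b*', 'cab')

# "ab*" needs an "a" followed by any number of b's
found = re.findall('ab*', 'a ab abb')

print(empty.span())  # (0, 0)
print(found)         # ['a', 'ab', 'abb']
```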


* \+ (plus): Matches one or more occurrences of the preceding character. For example, the pattern go+l would match "gol", "gool", "goooool", and so on.

In [9]:
# let's take some strings
s1 = "ac"
s2 = "abc"
s3 = "abbc"
s4 = "cab"

# let's use the plus meta character to match one or more occurrences of the preceding character
print(re.search('ab+c', s1))
print(re.search('ab+c', s2))
print(re.search('ab+c', s3))
print(re.search('ab+c', s4))

None
<re.Match object; span=(0, 3), match='abc'>
<re.Match object; span=(0, 4), match='abbc'>
None
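
The go+l pattern mentioned in the description can be tried the same way. This is a small sketch; the word list, including the non-matching "gl", is chosen here for illustration.

```python
import re

# "gl" has no "o" at all, so the "one or more" requirement fails
words = ["gl", "gol", "gool", "goooool"]

# record for each word whether go+l is found in it
results = {w: bool(re.search('go+l', w)) for w in words}

print(results)
# {'gl': False, 'gol': True, 'gool': True, 'goooool': True}
```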


* ? (question mark): Matches zero or one occurrence of the preceding character. For example, the pattern colou?r would match "color" and "colour".

In [10]:
# let's take some strings
s1 = "color"
s2 = "colour"
s3 = "colouur"

# let's use the question mark meta character to match zero or one occurrence of the preceding character
print(re.search('colou?r', s1))
print(re.search('colou?r', s2))
print(re.search('colou?r', s3))

<re.Match object; span=(0, 5), match='color'>
<re.Match object; span=(0, 6), match='colour'>
None
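
An optional character is handy for picking up both spellings of a word in running text. A minimal sketch, assuming a made-up sentence and using `re.findall` to collect every match:

```python
import re

# hypothetical sentence containing both spellings
text = "The color red and the colour blue"

# the "u" is optional, so both spellings are found
spellings = re.findall('colou?r', text)

print(spellings)  # ['color', 'colour']
```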


* [ ] (square brackets): Matches any single character within the brackets. For example, the pattern [aeiou] would match any vowel character.

In [11]:
# let's take some strings
s1 = "My Name is Akash Khanna, I Live in Mumbai."
s2 = "I am a Housewife"
s3 = "He Never loves to cook "

# let's search each string for the first character from the set [in]
print(re.search('[in]', s1))
print(re.search('[in]', s2))
print(re.search('[in]', s3))

<re.Match object; span=(8, 9), match='i'>
<re.Match object; span=(13, 14), match='i'>
None
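
The [aeiou] pattern from the description can be tried with `re.findall`, which returns every matching character. The sketch below also shows that character sets are case sensitive unless the standard `re.IGNORECASE` flag is passed; the sample strings are assumptions made for illustration.

```python
import re

# collect every lowercase vowel in the word
vowels = re.findall('[aeiou]', 'Mumbai')

# hypothetical sentence whose only "n" is the uppercase "N"
sentence = "He Never cooks"

# [in] is case sensitive by default, so "N" is not matched
case_sensitive = re.search('[in]', sentence)

# with re.IGNORECASE the uppercase "N" matches
ignore_case = re.search('[in]', sentence, flags=re.IGNORECASE)

print(vowels)              # ['u', 'a', 'i']
print(case_sensitive)      # None
print(ignore_case.span())  # (3, 4)
```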


* [^ ] (caret inside square brackets): Matches any single character that is not within the brackets. For example, the pattern [^0-9] would match any non-digit character.

In [12]:
# let's take some examples
s1 = "I am an actor"
s2 = "I earn 25 Lakhs per annum"
s3 = "I stay in Mumbai"
s4 = "4590"

# let's use the caret inside square brackets to find the first character that is not a digit
print(re.search('[^0-9]', s1))
print(re.search('[^0-9]', s2))
print(re.search('[^0-9]', s3))
print(re.search('[^0-9]', s4))

<re.Match object; span=(0, 1), match='I'>
<re.Match object; span=(0, 1), match='I'>
<re.Match object; span=(0, 1), match='I'>
None
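
Note that [^0-9] answers the question "is there a character that is not a digit?", which is different from "does the string contain a digit?". For the second question the set is used without the caret. A minimal sketch, where `has_digit` is a hypothetical helper name introduced here:

```python
import re

def has_digit(text):
    """Return True if the text contains at least one digit (hypothetical helper)."""
    return re.search('[0-9]', text) is not None

# first digit in the sentence is the "2" of "25"
first_digit = re.search('[0-9]', "I earn 25 Lakhs per annum")

print(has_digit("I am an actor"))  # False
print(has_digit("4590"))           # True
print(first_digit.span())          # (7, 8)
```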


* | (pipe): Acts as an OR operator, allowing for multiple alternatives. For example, the pattern apple|orange would match either "apple" or "orange".

In [None]:
# let's take some strings
s1 = "I love to eat mango"
s2 = "I love to eat banana"
s3 = "I love to eat orange"
s4 = "I love to eat papaya"

# let's use the pipe operator to match multiple alternatives from given sentences.
print(re.search('mango|papaya', s1))
print(re.search('mango|papaya', s2))
print(re.search('mango|papaya', s3))
print(re.search('mango|papaya', s4))
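
The cell above has no recorded output, so here is a small sketch of what to expect. It reuses the same four sentences and collects the matched text (or None) for each one; the list name `matched` is introduced here for illustration.

```python
import re

sentences = [
    "I love to eat mango",
    "I love to eat banana",
    "I love to eat orange",
    "I love to eat papaya",
]

# keep the matched fruit name, or None when neither alternative is found
matched = []
for s in sentences:
    m = re.search('mango|papaya', s)
    matched.append(m.group() if m else None)

print(matched)  # ['mango', None, None, 'papaya']
```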