# Metacharacters
<br>

- ### Metacharacters in regular expressions are special characters that have a symbolic meaning and are used to define the search pattern. 
<br>

- ### They provide flexibility and allow you to create more complex and powerful regular expressions. 

<br>

- ### Here are some commonly used metacharacters:

<br>

### 1 . (Dot)
- ### --> Matches any single character except a newline
<br>

### 2 ^ (Start With)
- ### --> Matches at the start of the string
<br>

### 3 $ (End With)
- ### --> Matches at the end of the string
<br>

### 4 * (Star)        
- ### ab* --> a, ab, abb, abbbb
- ### A single a followed by zero or more b
<br>

### 5 + (Plus)        
- ### ab+ --> ab, abb, abbb, abbbbbbb
- ### A single a followed by one or more b
<br>

### 6 ? (Question Mark)
- ### ab? --> a, ab
- ### A single a followed by zero or one b
<br>

### 7 \ (Backslash)
- ### Escapes special characters and introduces special sequences such as \d

<br>

### 8 | (Pipe)
- ### a|b --> matches either a or b
<br>

### 9 {} (Curly Braces)
- ### Specifies the exact number of occurrences, or a range, of the preceding character or group
<br>

### 10 () (parentheses)
- ### Groups characters together and captures the matched substring.
<br>

### 11 [] (square brackets)
- ### Specifies a character class that matches any single character listed inside the brackets.


<br><br>


- ### . (Dot)
- ### Matches any character except a newline character.

In [1]:
import re

In [2]:
string = "Mubeen Mubii Mubxx"

pattern = re.compile(r"Mub..")
pattern.findall(string) # every "Mub" followed by any two characters

['Mubee', 'Mubii', 'Mubxx']

In [3]:
# example 2

txt = "mubeen@gmail_pk ali@gmail_ru"

pattern = re.compile(r"gmail_..") # find gmail_pk and gmail_ru

pattern.findall(txt)

['gmail_pk', 'gmail_ru']
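The note that the dot skips newlines can be checked directly. The following is a minimal sketch; the sample string is made up for illustration, and `re.DOTALL` is the standard flag that lets the dot match a newline as well.

```python
import re

# Hypothetical sample text: "a", newline, "b", then "a-b" on the same line.
text = "a\nb a-b"

# By default the dot does not match "\n", so "a\nb" is skipped.
default_matches = re.findall(r"a.b", text)

# With re.DOTALL the dot matches every character, including "\n".
dotall_matches = re.findall(r"a.b", text, re.DOTALL)

print(default_matches)  # ['a-b']
print(dotall_matches)   # ['a\nb', 'a-b']
```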

- ### ^ (Start With)
<br>
 
- ### The caret (^) acts as an anchor in regular expressions. 
- ### When it is placed at the beginning of a pattern, it matches the start of a string. 
- ### This means that the pattern must appear at the very beginning of the string for a match to occur.

In [4]:
string = "Ali Ali"
pattern = re.compile(r"^Ali")

pattern.findall(string)

# Extracts "Ali" only when it appears at the very start of the string

['Ali']

In [5]:
# example 2

string = "Ali Ali\nAli"
pattern = re.compile(r"^Ali")

pattern.findall(string)

# The third "Ali" follows \n, but without re.MULTILINE ^ matches only at the start of the whole string

['Ali']

In [6]:
# With re.MULTILINE, ^ also matches at the start of every line

string = "Ali Ali\nAli"
pattern = re.compile(r"^Ali",re.MULTILINE)
pattern.findall(string)


['Ali', 'Ali']
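A common use of `^` with `re.MULTILINE` is picking out lines that begin with a given word. This is only a sketch: the log text below is invented for the example, and `.*` is used to take the rest of each matching line.

```python
import re

# Hypothetical multi-line log text, made up for this example.
log = "ERROR disk full\nINFO started\nERROR timeout"

# ^ anchors at the start of each line because of re.MULTILINE;
# .* then consumes the rest of that line (the dot stops at "\n").
error_lines = re.findall(r"^ERROR.*", log, re.MULTILINE)

print(error_lines)  # ['ERROR disk full', 'ERROR timeout']
```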

<br>

- ### $ (Dollar)
- ### Matches at the end of a string. With re.MULTILINE (re.M), it also matches at the end of each line.

In [7]:
string = "Ali Ali\nAli"
pattern = re.compile(r"Ali$") # matches only the Ali at the end of the string

pattern.findall(string)


['Ali']

In [8]:
string = "Ali Ali\nAli"
pattern = re.compile(r"Ali$",re.M) # matches the Ali at the end of each line

pattern.findall(string)


['Ali', 'Ali']
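One detail worth knowing: in Python, `$` also matches just before a single trailing newline, while `\Z` matches only at the very end of the string. The sketch below uses a made-up string to show the difference.

```python
import re

# Hypothetical input that ends with a newline, as lines read from a file often do.
text = "Ali\n"

# $ matches at the end of the string or just before a trailing newline.
dollar_matches = re.findall(r"Ali$", text)

# \Z matches only at the absolute end of the string, after the "\n".
strict_matches = re.findall(r"Ali\Z", text)

print(dollar_matches)  # ['Ali']
print(strict_matches)  # []
```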


- ### * (Star)
- ### Matches zero or more occurrences of the preceding character or group.

In [9]:
string = "aa abbb abb abbbb abc"
pattern = re.compile(r"ab*") # "a" followed by zero or more "b"
pattern.findall(string)



['a', 'a', 'abbb', 'abb', 'abbbb', 'ab']

<br>

- ### + (Plus)
- ### Matches one or more occurrences of the preceding character or group.

In [10]:
string = "aa abbb abb abbbb abc"
pattern = re.compile(r"ab+") # "a" followed by one or more "b"
pattern.findall(string)



['abbb', 'abb', 'abbbb', 'ab']
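To see `*` and `+` side by side, the sketch below runs both against the same made-up word list. The only difference is whether the zero-occurrence case ("ggle") is accepted.

```python
import re

# Hypothetical word list with zero, one, two and three "o" characters.
words = "ggle gogle google gooogle"

# o* allows zero or more "o", so "ggle" also matches.
star_matches = re.findall(r"go*gle", words)

# o+ requires at least one "o", so "ggle" is rejected.
plus_matches = re.findall(r"go+gle", words)

print(star_matches)  # ['ggle', 'gogle', 'google', 'gooogle']
print(plus_matches)  # ['gogle', 'google', 'gooogle']
```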

<br>

- ### ? (Question Mark)
- ### Matches zero or one occurrence of the preceding character or group.

In [11]:
string = "aa abbb abb abbbb abc"
pattern = re.compile(r"ab?") # "a" optionally followed by one "b"

pattern.findall(string)

['a', 'a', 'ab', 'ab', 'ab', 'ab']
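A typical use of `?` is making one letter optional, for example to accept two spellings of a word. The sample text here is an assumption chosen for illustration.

```python
import re

# Hypothetical text with two valid spellings and one misspelling.
text = "color colour colouur"

# u? means the "u" may appear zero times or once, but not twice.
spellings = re.findall(r"colou?r", text)

print(spellings)  # ['color', 'colour']
```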

<br>

- ### \ (Backslash)
- ### Escapes special characters, allowing you to match them literally. It also introduces special sequences such as \d (any digit) and \D (any non-digit).

In [12]:
text = "01234qbcd5678"

pattern = re.compile(r"\d") # \d means digits

pattern.findall(text)

['0', '1', '2', '3', '4', '5', '6', '7', '8']

In [13]:
pattern = re.compile(r"\D") # \D means any non-digit character

pattern.findall(text)

['q', 'b', 'c', 'd']
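The escaping role of the backslash can be sketched as follows. The file names are invented for the example; `re.escape` is the standard helper that adds the backslashes for you.

```python
import re

# Hypothetical text: one real file name and one look-alike without a dot.
text = "file.txt filextxt"

# Unescaped, the dot is a metacharacter and matches any character.
unescaped = re.findall(r"file.txt", text)

# Escaped with a backslash, the dot matches only a literal ".".
escaped = re.findall(r"file\.txt", text)

# re.escape builds the escaped pattern from a plain string.
auto_escaped = re.findall(re.escape("file.txt"), text)

print(unescaped)     # ['file.txt', 'filextxt']
print(escaped)       # ['file.txt']
print(auto_escaped)  # ['file.txt']
```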

<br>

- ### | (Pipe)
- ### Acts as an OR operator, matching either the expression before or after it.

In [14]:
text = "Mubeen Ali Ahmad Rizwan Ali"

pattern = re.compile(r"Mubeen|Ali") # extract all Mubeen or Ali
pattern.findall(text)

['Mubeen', 'Ali', 'Ali']
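The pipe applies to everything on each side of it, which can surprise you. The sketch below uses made-up text and a non-capturing group `(?:...)` (grouping without capturing, so `findall` still returns the whole match) to limit the alternation to the title only.

```python
import re

# Hypothetical text for illustration.
text = "Mr Smith and Mrs Smith"

# Without grouping, the alternatives are "Mr" and "Mrs Smith";
# "Mr" is tried first and wins at both positions.
ungrouped = re.findall(r"Mr|Mrs Smith", text)

# With a group, only the title alternates and " Smith" is always required.
grouped = re.findall(r"(?:Mr|Mrs) Smith", text)

print(ungrouped)  # ['Mr', 'Mr']
print(grouped)    # ['Mr Smith', 'Mrs Smith']
```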

<br>

- ### {} (Curly Braces)
- ### Specifies the exact number of occurrences or a range of occurrences of the preceding character or group.

In [15]:
text = "Mubeeen Mubeen Muben"

pattern = re.compile(r"Mube{2}") # "Mub" followed by exactly two "e"
pattern.findall(text)

['Mubee', 'Mubee']

In [16]:
# Example 2

text = "Hellooo"
pattern = re.compile(r"o{3}")

pattern.findall(text)

['ooo']

In [17]:
# Example 3

text = "Helloooo"
pattern = re.compile(r"o{3}")

pattern.findall(text)

['ooo']

In [18]:
# Example 4

text = "Hellooooo"    # five o's: one "ooo" match, the leftover "oo" is too short
pattern = re.compile(r"o{3}")

pattern.findall(text)

['ooo']

In [19]:
# Example 5

text = "Helloooooo"    # six o's: two non-overlapping "ooo" matches
pattern = re.compile(r"o{3}")

pattern.findall(text)

['ooo', 'ooo']
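The examples above use an exact count. The range forms `{m,n}` and `{m,}` can be sketched in the same way; the sample names below are made up, and the trailing `n` in the pattern forces the whole run of "e" to be counted.

```python
import re

# Hypothetical names with one, two, three and four "e" characters.
text = "Muben Mubeen Mubeeen Mubeeeen"

# e{2,3} accepts two or three "e" before the final "n".
two_to_three = re.findall(r"Mube{2,3}n", text)

# e{2,} accepts two or more "e" with no upper limit.
two_or_more = re.findall(r"Mube{2,}n", text)

print(two_to_three)  # ['Mubeen', 'Mubeeen']
print(two_or_more)   # ['Mubeen', 'Mubeeen', 'Mubeeeen']
```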

<br>

- ### () (parentheses)
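- ### Groups characters together and captures the matched substring. When a pattern contains groups, findall returns the captured groups (as tuples if there are several) instead of the whole match.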

In [20]:
# example 1 

text = "Today's date is 28-02-2023"

pattern = re.compile(r"\d{2}-\d{2}")  # without groups: returns the whole match "28-02"

pattern.findall(text)


['28-02']

In [21]:
# example 2: use groups ()

text = "Today's date is 28-02-2023"

pattern = re.compile(r"(\d{2})-(\d{2})")  # with groups: returns ("28", "02")

pattern.findall(text)


[('28', '02')]

In [22]:
# example 3

text = "Today's date is 28-02-2023"

pattern = re.compile(r"(\d{2})-(\d{2})-(\d{4})")  # with groups: returns ("28", "02", "2023")

pattern.findall(text)


[('28', '02', '2023')]
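When you need both the whole match and the individual groups, `re.search` with a match object is one option. This is a small sketch reusing the same date string; the variable names `day`, `month` and `year` are chosen here for illustration.

```python
import re

text = "Today's date is 28-02-2023"

# search returns a match object for the first match, or None.
match = re.search(r"(\d{2})-(\d{2})-(\d{4})", text)

if match:
    full_date = match.group(0)         # the whole matched text
    day, month, year = match.groups()  # the three captured groups
    print(full_date)         # 28-02-2023
    print(day, month, year)  # 28 02 2023
```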

<br> 

- ### [] (square brackets)

- ### Specifies a character class, matches any single character within the brackets.

In [34]:
text = "Hello World"

pattern = re.compile(r"[aeiou]")

pattern.findall(text)

['e', 'o', 'o']

In [35]:
# example 2

text = "hello0123World"
pattern = re.compile(r"[0-9]")
pattern.findall(text)

['0', '1', '2', '3']
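Inside square brackets, a leading `^` negates the class, and a range such as `A-Z` works like `0-9`. The sketch below reuses "Hello World" to show both.

```python
import re

text = "Hello World"

# [^aeiou ] matches any single character that is NOT a vowel or a space.
non_vowels = re.findall(r"[^aeiou ]", text)

# [A-Z] matches any single uppercase ASCII letter.
capitals = re.findall(r"[A-Z]", text)

print(non_vowels)  # ['H', 'l', 'l', 'W', 'r', 'l', 'd']
print(capitals)    # ['H', 'W']
```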