# **Regular Expressions in Python**
A regular expression (also called RE or RegEx) is a sequence of characters that forms a search pattern. It is used to check whether a given string contains text that matches the pattern.




# RegEx Module
Python has a built-in package called `re`, which can be used to work with regular expressions. The module provides functions for searching, splitting, and replacing text based on a pattern.

[Here](https://docs.python.org/3/library/re.html) is the link to the official documentation for the `re` module.

In [None]:
import re   #Now you can start using regular expressions

## Example

In [None]:
text = "Use of python in Machine Learning"
x = re.search(r"^Use.*Learning$", text)    # Does the text start with "Use" and end with "Learning"?
if x:
   print("YES! We have a match!")
else:
   print("No match")

## RegEx Functions
* **re.findall()**: Returns a list of all non-overlapping matches of the pattern.
* **re.search()**: Returns a Match object for the first match anywhere in the string, or `None` if there is no match.
* **re.split()**: Returns a list in which the string has been split at each match.
* **re.sub()**: Replaces one or more matches with a replacement string.

## Metacharacters
Here are some metacharacters that have a special meaning in RegEx:
* **[]** -- A set of characters, e.g. `[a-m]`
* **\\** -- Signals a special sequence (also used to escape special characters)
* **.** -- Any character except a newline character
* **$** -- Ends with
* **^** -- Starts with
* **+** -- One or more occurrences
* **\*** -- Zero or more occurrences
* **{}** -- Exactly the specified number of occurrences
* **|** -- Either or
* **()** -- Capture and group
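
As a rough illustration of the metacharacters listed above, the sketch below applies several of them to a made-up sample string. The string and the variable names are illustrative assumptions, not part of the original notebook.

```python
import re

sample = "cat bat rat caat ct"  # hypothetical sample text

# .  any single character (except newline) between 'c' and 't'
dot = re.findall(r"c.t", sample)            # ['cat']

# +  one or more 'a' characters
plus = re.findall(r"ca+t", sample)          # ['cat', 'caat']

# *  zero or more 'a' characters
star = re.findall(r"ca*t", sample)          # ['cat', 'caat', 'ct']

# {} exactly two 'a' characters
exact = re.findall(r"ca{2}t", sample)       # ['caat']

# |  either 'bat' or 'rat'
either = re.findall(r"bat|rat", sample)     # ['bat', 'rat']

# [] one character from the set, followed by 'at'
in_set = re.findall(r"[br]at", sample)      # ['bat', 'rat']

# () capture only the part inside the parentheses
captured = re.findall(r"([br])at", sample)  # ['b', 'r']

# ^ and $ anchor the pattern to the start and end of the string
starts_with_cat = re.search(r"^cat", sample) is not None   # True
ends_with_ct = re.search(r"ct$", sample) is not None       # True
```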


## Special Sequences
A special sequence in RegEx is a \ followed by one of the characters listed below, and has a special meaning:
* **\A** -- Matches if the specified characters are at the beginning of the string
* **\b** -- Matches where the specified characters are at the beginning or at the end of a word
* **\B** -- Matches where the specified characters are present, but NOT at the beginning or at the end of a word
* **\D** -- Matches any character that is not a digit
* **\d** -- Matches any digit character
* **\s** -- Matches any white space character
* **\S** -- Matches any character that is not a white space character
* **\W** -- Matches any character that is not a word character
* **\w** -- Matches any word character (letters, digits, and the underscore)
* **\Z** -- Matches if the specified characters are at the end of the string
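
To make the special sequences above more concrete, here is a small sketch that runs each of them against a made-up sentence. The sample text and variable names are assumptions chosen for illustration.

```python
import re

sample = "Room 42 is on floor 3"  # hypothetical sample text

digits = re.findall(r"\d+", sample)        # runs of digits: ['42', '3']
non_digits = re.findall(r"\D+", sample)    # runs of non-digits: ['Room ', ' is on floor ']
words = re.findall(r"\w+", sample)         # runs of word characters
spaces = re.findall(r"\s", sample)         # each white space character
non_spaces = re.findall(r"\S+", sample)    # runs of non-space characters

# \b: 'on' only as a whole word
whole_word = re.findall(r"\bon\b", sample)     # ['on']

# \B: 'oo' only when it sits inside a word (as in 'Room' and 'floor')
inside_word = re.findall(r"\Boo\B", sample)    # ['oo', 'oo']

# \A and \Z: anchor to the very start and very end of the string
starts_with_room = re.search(r"\ARoom", sample) is not None   # True
ends_with_3 = re.search(r"3\Z", sample) is not None           # True
```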

## Sets
A set in RegEx is a group of characters inside a pair of square brackets [] with a special meaning.
* **[raj]**: Matches any one of the specified characters (r, a, or j)
* **[a-n]**: Matches any lowercase letter alphabetically between a and n
* **[^arn]**: Matches any character except a, r, and n
* **[0123]**: Matches any one of the specified digits (0, 1, 2, or 3)
* **[0-9]**: Matches any digit between 0 and 9
* **[0-7][0-8]**: Matches any two-digit sequence whose first digit is between 0 and 7 and whose second digit is between 0 and 8 (for example 00 or 78, but not 09 or 80)
* **[a-zA-Z]**: Matches any letter between a and z, in lowercase or uppercase
* **[+]**: Matches a literal + character; inside a set, characters such as +, *, ., |, (), $ and {} have no special meaning
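
The following sketch tries a few of the sets above on a made-up string. The sample text is an assumption picked so that `[0-7][0-8]` matches 78 but not 09 or 80.

```python
import re

sample = "78 09 80 a+b"  # hypothetical sample text

# [0-9] matches one digit at a time
single_digits = re.findall(r"[0-9]", sample)       # ['7', '8', '0', '9', '8', '0']

# [0-7][0-8]: first digit 0-7, second digit 0-8
two_digits = re.findall(r"[0-7][0-8]", sample)     # ['78']

# [a-zA-Z] matches any single letter
letters = re.findall(r"[a-zA-Z]", sample)          # ['a', 'b']

# [+] matches a literal plus sign; no escaping is needed inside a set
plus_signs = re.findall(r"[+]", sample)            # ['+']

# [^0-9 ] matches anything that is neither a digit nor a space
others = re.findall(r"[^0-9 ]", sample)            # ['a', '+', 'b']
```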


## Example on findall() function:

In [None]:
# findall() scans the whole string and returns every non-overlapping match as a list of strings
s = "black, blue and brown"
pattern = r'bl\w+'
matches = re.findall(pattern,s)

print(matches)

The above pattern matches the literal string 'bl' followed by one or more word characters, as specified by the \w+ rule. Therefore, findall() returns the list of substrings that match the whole pattern: ['black', 'blue'].
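
One detail that often surprises people: when the pattern contains capture groups, `findall()` returns the captured groups instead of the whole match. The sketch below reuses the same string; the extra patterns are illustrative additions.

```python
import re

s = "black, blue and brown"

# No groups: each item is the whole match.
whole = re.findall(r"bl\w+", s)         # ['black', 'blue']

# One group: each item is only the text captured by that group.
tails = re.findall(r"bl(\w+)", s)       # ['ack', 'ue']

# Two groups: each item is a tuple with one entry per group.
pairs = re.findall(r"(b\w)(\w+)", s)    # [('bl', 'ack'), ('bl', 'ue'), ('br', 'own')]

print(whole, tails, pairs)
```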

## Example on split() function:

In [None]:
# re.split(pattern, string) returns a list in which the string has been split at each match of the pattern
string = "I felt happy because I saw the others were happy."
x = re.split(" ", string)          # The pattern decides where the string is split; here, at every space
print(x)
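
Splitting on a single space could also be done with `str.split()`. The advantage of `re.split()` shows up when the separator is itself a pattern. The sketch below uses a made-up line with mixed separators, and the `maxsplit` argument to limit the number of splits.

```python
import re

line = "apple, banana;cherry  date"  # hypothetical input with mixed separators

# Split on a comma or semicolon (with optional spaces around it), or on a run of whitespace
separator = r"\s*[,;]\s*|\s+"
parts = re.split(separator, line)                    # ['apple', 'banana', 'cherry', 'date']

# maxsplit=1 stops after the first split and leaves the rest untouched
head_and_rest = re.split(separator, line, maxsplit=1)   # ['apple', 'banana;cherry  date']

print(parts)
print(head_and_rest)
```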


## Example on search() function:

In [None]:
# The search() function searches the string for a match and returns a Match object if there is one (otherwise None).
txt = "Python is one of the most popular languages around the world"
searchObj = re.search(r"\s", txt)    # Raw string, so the backslash reaches the regex engine unchanged
print("The first white-space character is located in position: ", searchObj.start())     # .start() gives the index where the match begins
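
Besides `.start()`, a Match object exposes a few other commonly used methods. The sketch below reuses the same sentence with an illustrative pattern, and also shows that `search()` returns `None` when nothing matches, so the result should be checked before it is used.

```python
import re

txt = "Python is one of the most popular languages around the world"

match = re.search(r"\bpop\w*", txt)   # first word starting with 'pop'
if match:
    word = match.group()              # the matched text: 'popular'
    start, end = match.span()         # start and end index of the match
    print(word, start, end)

# No digits in the sentence, so search() returns None
no_match = re.search(r"\d", txt)
print(no_match)
```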

## Example on sub() function:

In [None]:
# Using the sub() method, we can replace one or more occurrences of a regex pattern in the target string with a substitute string.
phone_no = '(212)-456-7890'
pattern = r'\D'
result = re.sub(pattern, '', phone_no)
print(result)    # In this example, \D is the inverse digit character set: it matches any single character that is not a digit.
                 # Therefore, the sub() function replaces all non-digit characters with the empty string ''.
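
`re.sub()` also accepts an optional `count` argument and, instead of a replacement string, a function that receives each Match object. The sketch below shows both on made-up inputs; the helper name `mask` is a hypothetical choice for this illustration.

```python
import re

messy = "too    many   spaces"  # hypothetical input

# Collapse every run of whitespace into a single space
clean = re.sub(r"\s+", " ", messy)                   # 'too many spaces'

# count=1 replaces only the first match
first_only = re.sub(r"\s+", " ", messy, count=1)     # 'too many   spaces'

def mask(m):
    """Replace each matched digit run with '#' characters of the same length."""
    return "#" * len(m.group())

masked = re.sub(r"\d+", mask, "PIN 1234, code 56")   # 'PIN ####, code ##'

print(clean)
print(first_only)
print(masked)
```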

## Some useful regex examples of real-world applications:
**Example 1:** Extract date from string

In [None]:
string = "My email id is abc@gmail.com which I have created on 05/01/2021"    # Now we want to extract the date from the given string
x = re.findall(r'[0-9]{1,2}/[0-9]{1,2}/[0-9]{4}', string)    # Pattern: 1-2 digits, 1-2 digits, and 4 digits, separated by slashes

print(x)

# OUTPUT
# ['05/01/2021']

**Example 2**: Extract email id from text

In [None]:
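s = 'talentsprint@gmail.com1998'
result = re.findall(r'[a-zA-Z0-9]\S*@\S*[a-zA-Z]', s)    # The match must end on a letter, so the trailing digits are dropped
print(result)

# OUTPUT
# ['talentsprint@gmail.com']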

**Example 3**: Check whether an email id is valid

In [None]:
string = "regularexpression123@gmail.com"
x = re.search(r'^[a-z0-9]+[._]?[a-z0-9]+@\w+\.\w{2,3}$', string)    # ^ and $ anchor the pattern to the whole string

if x:
  print("Valid Email")
else:
  print("Not Valid")

# OUTPUT
# Valid Email
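
To check several addresses, the same idea can be wrapped in a small helper. The sketch below uses a close variant of the pattern above, applied with `re.fullmatch()` so that the whole string has to match. The function name `is_valid_email` and the sample addresses are assumptions for illustration, and the pattern is a simplified check, not a complete email validator.

```python
import re

# Compiled once and reused; a simplified pattern, not a full email specification
EMAIL_RE = re.compile(r"[a-z0-9]+[._]?[a-z0-9]+@\w+\.\w{2,3}")

def is_valid_email(address):
    """Return True if the whole address matches the simplified pattern."""
    return EMAIL_RE.fullmatch(address) is not None

samples = [
    "regularexpression123@gmail.com",   # matches
    "first.last@example.org",           # matches: one dot allowed in the local part
    "no-at-sign.com",                   # no @, rejected
    "!!abc@gmail.com",                  # leading junk, rejected by fullmatch
    "a@b.co",                           # rejected: the pattern needs at least 2 characters before @
]

results = {address: is_valid_email(address) for address in samples}
for address, ok in results.items():
    print(address, "->", "Valid Email" if ok else "Not Valid")
```

The last sample shows a limitation of this pattern: some real addresses are rejected, so it is best treated as a teaching example.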

**Example 4**: Check whether a URL is valid

In [None]:
string = "https://sohanlalgupta.tech"
x = re.search(r'((http|https)://)(www\.)?[a-zA-Z0-9@:%._\+~#?&/=]{2,256}\.[a-z]{2,6}\b([-a-zA-Z0-9@:%._\+~#?&/=]*)', string)

if x:
  print("Valid URL")
else:
  print("Not Valid")

# OUTPUT
# Valid URL

## Exercises
* **Question 1** -- Extract the numbers (integers and floating-point values, including negative ones) from the text using a Python regular expression.
 * **Sample Input:** Sound Level: -11.7 db or 15.2 or 8 db
 * **Expected Output:** ['-11.7', '15.2', '8']
* **Question 2** -- Convert a date from yyyy-mm-dd format to dd-mm-yyyy format.
 * **Sample Input:** 2026-01-02
 * **Expected Output:** 02-01-2026
* **Question 3** -- Replace at most 2 occurrences of a space, comma, or dot with a colon.
 * **Sample Input**: 'Python Exercises, Java exercises.'
 * **Expected Output**: Python:Exercises: Java exercises.

For more RegEx Examples, refer:
[Important Regular Expressions](https://medium.com/@anindyasdas/important-regular-expressions-def051aa7425)
