## Regular Expression

A regular expression, often called a pattern, is **an expression used to specify a set of strings** required for a particular purpose. 

- A simple way to specify a finite set of strings is to list its elements or members. <br>For example `{Doc1,Doc2,Doc3}`.
    

`{Doc1,Doc2,Doc3}` can be specified by the pattern `Doc(1|2|3)`. <br>We say that this pattern matches each of the three strings. [Let's check?](https://regex101.com/)

> In most formalisms, if at least one regular expression matches a particular set, then infinitely many other regular expressions also match it, i.e. **the specification is not unique**.<br>
For example, the string set `{Doc1,Doc2,Doc3}` can also be specified by the pattern `Doc[123]`. The broader pattern `Doc\d` matches all three strings as well, but it also accepts any other single digit after `Doc`, such as `Doc7`.
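
To make the non-uniqueness concrete, here is a minimal sketch using Python's `re` module. The candidate list and the helper `matched_set` are made up for illustration; `re.fullmatch` is used so that a pattern has to cover the whole string.

```python
import re

# Hypothetical candidate strings, chosen only for this illustration
candidates = ["Doc1", "Doc2", "Doc3", "Doc7", "Doc12"]

def matched_set(pattern):
    """Return the candidates that the pattern matches in full."""
    return [s for s in candidates if re.fullmatch(pattern, s)]

alternation = matched_set(r"Doc(1|2|3)")
char_class = matched_set(r"Doc[123]")
any_digit = matched_set(r"Doc\d")

print(alternation)  # ['Doc1', 'Doc2', 'Doc3']
print(char_class)   # ['Doc1', 'Doc2', 'Doc3']
print(any_digit)    # ['Doc1', 'Doc2', 'Doc3', 'Doc7']
```

The first two patterns select exactly the same strings, while `Doc\d` selects a larger set because `\d` accepts every digit.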



## Uses of Regular Expressions

**Some important uses of regular expressions are:**

- Check whether an input conforms to a given pattern; for example, we can check whether a value entered in an HTML form is a valid e-mail address
> `Maniteja123@gmail.com`

- Look for occurrences of a pattern in a piece of text; for example, check whether either the word "color" or the word "colour" appears in a document in just **one scan**
> `I like Red color and i am wearing a Red colour shirt`

- Extract specific portions of a text; for example, extract the postal code of an address
> `Mr John Smith. 132, My Street, Kingston, New York 12401.`

- Replace portions of text; for example, replace each occurrence of "color" with "colour"
> `I like Red colour and i am wearing a Red colour shirt`

- Split a larger text into smaller pieces; for example, split a text at every occurrence of a dot, comma, or newline character
> `myself person1,you are person2`
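
The five uses above can be sketched with Python's `re` module, reusing the sample strings from the list. The patterns are illustrative assumptions, not definitive ones: the e-mail pattern in particular is deliberately simplified and does not cover every valid address, and the postal code is assumed to be a standalone run of five digits.

```python
import re

# 1. Validate: simplified e-mail pattern (an assumption, not a complete validator)
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(\.[\w-]+)+")
is_valid = EMAIL.fullmatch("Maniteja123@gmail.com") is not None   # True

# 2. Search: find "color" and "colour" in a single scan
text = "I like Red color and i am wearing a Red colour shirt"
found = re.findall(r"colou?r", text)            # ['color', 'colour']

# 3. Extract: assume the postal code is a standalone run of five digits
address = "Mr John Smith. 132, My Street, Kingston, New York 12401."
m = re.search(r"\b\d{5}\b", address)
postal_code = m.group() if m else None          # '12401'

# 4. Replace: turn the whole word "color" into "colour"
replaced = re.sub(r"\bcolor\b", "colour", text)
# 'I like Red colour and i am wearing a Red colour shirt'

# 5. Split: break the text at any dot, comma, or newline
pieces = re.split(r"[.,\n]", "myself person1,you are person2")
# ['myself person1', 'you are person2']
```

In the replace step, the word boundaries (`\b`) keep the existing "colour" untouched, since "color" inside "colour" is followed by another word character.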

## Meta Characters

- All meta characters: `. ^ $ * + ? { } [ ] \ | ( )`

  1. `.` Any character (except the newline character)
  2. `^` Starts with, e.g. `^word`
  3. `$` Ends with, e.g. `word$`
  4. `*` Zero or more occurrences
  5. `+` One or more occurrences
  6. `?` Zero or one occurrence
  7. `{}` Exactly the specified number of occurrences, e.g. `M{2}`
  8. `[]` A set of characters, e.g. `[a-c]`
  9. `\` Signals a special sequence (can also be used to escape special characters), e.g. `\d`
  10. `|` Either or, e.g. `apple|iphone`
  11. `()` Capture and group
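
A short sketch of how the meta characters behave, using Python's `re` module. The sample strings and variable names are invented for illustration only.

```python
import re

# `.` matches any character except a newline
dot = re.findall(r"h.t", "hat hot h\nt")            # ['hat', 'hot']

# `^` anchors at the start, `$` anchors at the end
starts = bool(re.search(r"^word", "word play"))     # True
ends = bool(re.search(r"word$", "a word play"))     # False

# `*`, `+`, `?` and `{}` control how often the preceding item repeats
star = bool(re.fullmatch(r"ab*", "a"))              # True: zero b's allowed
plus = bool(re.fullmatch(r"ab+", "a"))              # False: at least one b needed
optional = bool(re.fullmatch(r"ab?", "abb"))        # False: at most one b allowed
exactly = re.findall(r"M{2}", "M MM MMM")           # ['MM', 'MM']

# `[]` is a set, `\` starts a special sequence or escapes a meta character
in_set = re.findall(r"[a-c]", "abcdef")             # ['a', 'b', 'c']
digits = re.findall(r"\d", "a1b2")                  # ['1', '2']
literal_dot = re.findall(r"\.", "a.b")              # ['.']

# `|` is alternation, `()` captures a group
either = re.findall(r"apple|iphone", "apple makes the iphone")  # ['apple', 'iphone']
captured = re.search(r"Doc(\d)", "Doc2").group(1)   # '2'
```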

#### Some examples of sets
 1. `[arn]` Returns a match where one of the specified characters (a, r, or n) is present
 2. `[a-n]` Returns a match for any lowercase character alphabetically between a and n
 3. `[^arn]` Returns a match for any character EXCEPT a, r, and n
 4. `[0123]` Returns a match where any of the specified digits (0, 1, 2, or 3) is present
 5. `[0-9]` Returns a match for any digit between 0 and 9
 6. `[0-5][0-9]` Returns a match for any two-digit number from 00 to 59
 7. `[a-zA-Z]` Returns a match for any character alphabetically between a and z, lowercase OR uppercase
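
The set examples can be tried out with `re.findall`, which returns every non-overlapping match. The sample strings below are assumptions picked to keep the results easy to read.

```python
import re

sample = "ran 2045"

one_of = re.findall(r"[arn]", sample)        # ['r', 'a', 'n']
a_to_n = re.findall(r"[a-n]", sample)        # ['a', 'n']  ('r' is outside a-n)
negated = re.findall(r"[^arn]", sample)      # [' ', '2', '0', '4', '5']
listed = re.findall(r"[0123]", sample)       # ['2', '0']
any_digit = re.findall(r"[0-9]", sample)     # ['2', '0', '4', '5']

# Two sets side by side: first digit 0-5, second digit 0-9
two_digit = re.findall(r"[0-5][0-9]", "08:45 and 72")   # ['08', '45']

letters = re.findall(r"[a-zA-Z]", "Ab1")     # ['A', 'b']
```

Note that `72` is not matched by `[0-5][0-9]`, because its first digit falls outside `0-5`.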

## Special Sequences
- A special sequence is a `\` followed by one of the characters in the list below, and has a special meaning:

  1. `\d` : Matches any decimal digit; for ASCII text this is equivalent to the class `[0-9]`.
  2. `\D` : Matches any non-digit character; for ASCII text this is equivalent to the class `[^0-9]`.
  3. `\s` : Matches any whitespace character, such as a space, a newline (`\n`), or a tab (`\t`).
  4. `\S` : Matches any non-whitespace character.
  5. `\w` : Matches any alphanumeric (word) character; for ASCII text this is equivalent to the class `[a-zA-Z0-9_]`.
  6. `\W` : Matches any non-alphanumeric character; for ASCII text this is equivalent to the class `[^a-zA-Z0-9_]`.
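
A minimal sketch of the special sequences in Python's `re` module, run on a made-up five-character sample. One Python-specific detail: for `str` patterns the sequences are Unicode-aware by default, so the ASCII equivalences listed above hold exactly only when the `re.ASCII` flag is set.

```python
import re

sample = "a1 _\t"   # letter, digit, space, underscore, tab

digits = re.findall(r"\d", sample)        # ['1']
non_digits = re.findall(r"\D", sample)    # ['a', ' ', '_', '\t']
spaces = re.findall(r"\s", sample)        # [' ', '\t']
non_spaces = re.findall(r"\S", sample)    # ['a', '1', '_']
word_chars = re.findall(r"\w", sample)    # ['a', '1', '_']
non_word = re.findall(r"\W", sample)      # [' ', '\t']

# Unicode-aware by default; re.ASCII restricts \w to [a-zA-Z0-9_]
unicode_word = re.findall(r"\w", "é")             # ['é']
ascii_word = re.findall(r"\w", "é", re.ASCII)     # []
```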