# Table of Contents - Regex

- [all words containing 5 characters](#all-words-containing-5-characters)
- [check string](#check-string)
- [find substring](#find-substring)
- [keep alphanumeric only](#keep-alphanumeric-only)
- [remove parenthesis](#remove-parenthesis)
- [remove white space](#remove-white-space)
- [remove zero](#remove-zero)

## all words containing 5 characters

In [None]:
# Problem

# Write a Python program to find all five-character words in a string.

# Input
# 'The quick brown fox jumps over the lazy dog.'

# Output
# ['quick', 'brown', 'jumps']

In [7]:
# Solution

import re

string = 'The quick brown fox jumps over the lazy dog.'
regex_pattern = r'\b\w{5}\b'
re.findall(regex_pattern, string)   # search the variable defined above

['quick', 'brown', 'jumps']

### Some Notes:

- #### Word characters (`\w`) include alphanumeric characters (A-Z, a-z and 0-9) and underscores (_). For Python 3 `str` patterns, `\w` also matches Unicode letters and digits unless the `re.ASCII` flag is set.

- #### `\b` asserts a position at a word boundary: (^\w | \w$ | \W\w | \w\W).

- #### Three different positions qualify as word boundaries `\b`:
      - Before the first character in the string, if the first character is a word character. (^\w)
      - After the last character in the string, if the last character is a word character. (\w$)
      - Between two characters in the string, where one is a word character and the other is not a word character. (\W\w | \w\W)
      
- ##### Live Demo: [https://regex101.com/r/lebiRJ/1](https://regex101.com/r/lebiRJ/1)
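
The notes on `\b` can be checked with a small comparison. This is a minimal sketch; the sample strings and variable names are made up for illustration and are not part of the original exercise.

```python
import re

sample = 'exercises quick'

# Without \b, \w{5} also matches the first five characters of a longer word.
no_boundaries = re.findall(r'\w{5}', sample)

# With \b on both sides, only whole five-character words match.
with_boundaries = re.findall(r'\b\w{5}\b', sample)

# Digits and underscores count as word characters too.
mixed = re.findall(r'\b\w{5}\b', 'ab_12 hello')

print(no_boundaries)    # ['exerc', 'quick']
print(with_boundaries)  # ['quick']
print(mixed)            # ['ab_12', 'hello']
```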

## check string

In [None]:
# Problem

# Write a Python program to check that a string contains only a certain set of characters (in this case a-z, A-Z and 0-9).

# Input
# "ABCDEFabcdef123450"
# "*&%@#!}{"

# Output
# True                                                                                                          
# False

In [11]:
# Solution

import re

def match_charset(string):
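    # fullmatch succeeds only if the entire string consists of allowed characters;
    # findall would return True as soon as any single allowed character appears.
    pattern = r'[a-zA-Z0-9]+'
    return bool(re.fullmatch(pattern, string))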

print(match_charset("ABCDEFabcdef123450"))
print(match_charset("*&%@#!}{"))

True
False


## find substring

In [None]:
# Problem

# Write a Python program to find the occurrences and positions of a substring within a string.
# 
# Input
# text = 'Python exercises, PHP exercises, C# exercises'
# pattern = 'exercises'
# 
# Output
# Found "exercises" at 7:16                                                                                     
# Found "exercises" at 22:31                                                                                    
# Found "exercises" at 36:45

In [18]:
# Solution

import re

def find_substring(text, pattern):
    for match in re.finditer(pattern, text):
        s = match.start()
        e = match.end()
        print('Found "{}" at {}:{}'.format(pattern, s, e))
        
find_substring('Python exercises, PHP exercises, C# exercises', 'exercises')

Found "exercises" at 7:16
Found "exercises" at 22:31
Found "exercises" at 36:45
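
The solution above passes the substring straight to `re.finditer`, so it is interpreted as a regular expression. That works for `'exercises'`, but a substring containing regex metacharacters (for example `+` or `.`) would be read as regex syntax rather than literal text. A possible variant, assuming literal matching is what is wanted, escapes the substring first. The helper name `find_literal` is hypothetical.

```python
import re

def find_literal(text, substring):
    # re.escape makes every character in the substring match literally.
    pattern = re.escape(substring)
    # Return (start, end) pairs instead of printing, so the result is reusable.
    return [(m.start(), m.end()) for m in re.finditer(pattern, text)]

print(find_literal('C++ exercises, C++ examples', 'C++'))
# [(0, 3), (15, 18)]
```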


## keep alphanumeric only

In [None]:
# Problem

# Write a Python program to remove everything except alphanumeric characters from a string.

# Input
# '**//Python Exercises// - 12. '

# Output
# PythonExercises12

In [28]:
# Solution1

import re

def keep_alphanumeric(string):
    pattern = r'[a-zA-Z0-9]'
    return ''.join(re.findall(pattern, string))

print(keep_alphanumeric('**//Python Exercises// - 12. '))
print(keep_alphanumeric('534@!345#$keep%^ -*123'))

PythonExercises12
534345keep123


In [29]:
# Solution2

import re

text1 = '**//Python Exercises// - 12. '
pattern = re.compile(r'[\W_]+')     # Compile the pattern into a reusable Pattern object; the raw string keeps \W intact.
print(pattern.sub('', text1))       # Replace every non-overlapping run of non-word characters or underscores
                                    # with the empty string.

PythonExercises12


## remove parenthesis

In [None]:
# Problem

# Write a Python program to remove the parenthesized part of each string.
# 
# Input
# ["example (.com)", "w3resource", "github (.com)", "stackoverflow (.com)"]
# 
# Output
# example                                                                                                       
# w3resource                                                                                                    
# github                                                                                                        
# stackoverflow 

In [None]:
# Solution
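
The original notebook leaves this cell empty. One possible approach, offered as a sketch rather than the intended answer, uses `re.sub` to delete an opening parenthesis, everything up to the next closing parenthesis, and any whitespace just before it. The function name `remove_parenthesized` is hypothetical, and the pattern assumes parentheses are not nested.

```python
import re

def remove_parenthesized(text):
    # \s*     optional whitespace before the parenthesis
    # \(      a literal opening parenthesis
    # [^)]*   any characters except a closing parenthesis
    # \)      a literal closing parenthesis
    return re.sub(r'\s*\([^)]*\)', '', text)

items = ["example (.com)", "w3resource", "github (.com)", "stackoverflow (.com)"]
for item in items:
    print(remove_parenthesized(item))
```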



## remove white space

In [None]:
# Problem

# Write a Python program to remove all whitespace from a string.
# 
# Input
# ' Python    Exercises '
# Output
# PythonExercises

In [None]:
# Solution
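
No solution is given in the original. A minimal sketch, assuming "whitespace" means anything matched by `\s` (spaces, tabs, newlines), could look like this. The function name `remove_whitespace` is hypothetical.

```python
import re

def remove_whitespace(text):
    # \s+ matches one or more whitespace characters; each run is replaced with ''.
    return re.sub(r'\s+', '', text)

print(remove_whitespace(' Python    Exercises '))
# PythonExercises
```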



## remove zero

In [None]:
# Problem

# Write a Python program to remove leading zeros from an IP address.

# Input
# "216.08.094.196"

# Output
# 216.8.94.196

In [None]:
# Solution
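
This cell is also empty in the original. One way to approach it, as a sketch under the assumption that the input is a dotted string of digits, is to remove zeros that begin a number and are followed by at least one more digit. The lookahead keeps a lone `0` intact. The function name `remove_leading_zeros` is hypothetical.

```python
import re

def remove_leading_zeros(ip):
    # \b      start of a number (boundary between '.' or string start and a digit)
    # 0+      one or more zeros
    # (?=\d)  only if another digit follows, so a single "0" is left as it is
    return re.sub(r'\b0+(?=\d)', '', ip)

print(remove_leading_zeros("216.08.094.196"))
# 216.8.94.196
```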



### [Move to Top](#Table-of-Contents---Regex)

### <center> The End </center>