# Repeating Characters

Regular expressions are useful because a pattern can leave some uncertainty about what it matches. For instance, we can require that a character in the pattern be a numerical digit, which allows it to be any one of 10 characters.

Furthermore, as you will see today, regex also lets us leave some uncertainty about **how many times** each character must occur in a text for a match to occur.

By default, every character in a regex pattern matches exactly 1 character. Today's lesson is all about creating regex patterns that match 0 or more, 1 or more, 0 or 1, a specific number of, or a specific range of occurrences of a character.

In [1]:
import re

## Repeating characters

The pattern `abc` will match the text `abc`. What if we want to match `abbc` and `abbbbbbbbbc` as well?

To do this, we need to tell regex that the `b` in the pattern can occur 1 or more times. We do so by adding an operator **after** the character `b`. The pattern then becomes `ab+c`, where `+` is the operator.
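
As a minimal sketch of this idea (the sample strings and the `results` name are chosen here purely for illustration), `ab+c` accepts any number of `b`s as long as there is at least one:

```python
import re

# `+` applies only to the character immediately before it, here `b`.
pattern = re.compile("ab+c")

# "ac" is included to show that at least one `b` is still required.
samples = ["abc", "abbc", "abbbbbbbbbc", "ac"]
results = {text: bool(pattern.fullmatch(text)) for text in samples}
print(results)
```

The first three strings should match and `ac` should not, since `+` demands at least one `b`.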

### Zero or more occurrences

`*` is a special character in a regex pattern that allows the preceding character to occur **zero or more times**.

In [2]:
pattern = re.compile("hey*")
print(pattern.findall("hey"))
print(pattern.findall("he"))
print(pattern.findall("heyyyyyyyyyyy"))
print(pattern.findall("heyyy.yyyyyyyy"))

['hey']
['he']
['heyyyyyyyyyyy']
['heyyy']


### One or more occurrences

`+` is a special character in a regex pattern that requires the preceding character to occur **one or more times**.

In [3]:
pattern = re.compile("hey+")
print(pattern.findall("hey"))
print(pattern.findall("he"))
print(pattern.findall("heyyyyyyyyyyy"))
print(pattern.findall("heyyy.yyyyyyyy"))

['hey']
[]
['heyyyyyyyyyyy']
['heyyy']


### Zero or one occurrence

`?` is a special character that matches the preceding character **zero or one time**, effectively making it optional.

In [4]:
pattern = re.compile("coder?")
print(pattern.findall("code"))
print(pattern.findall("coder"))
print(pattern.findall("coderr"))
print(pattern.findall("ccoderr"))
print(pattern.findall("code."))

['code']
['coder']
['coder']
['coder']
['code']


### Exact number of occurrences

The syntax `{m}` indicates that the character modified needs to occur exactly `m` times to be matched.

In [5]:
pattern = re.compile("wate{4}r")
print(pattern.findall("wateeer"))
print(pattern.findall("wateeeer"))
print(pattern.findall("wateeeeer"))

[]
['wateeeer']
[]


### Range of occurrences

A modifier with the syntax `{min,max}` allows us to specify a range of times a character/set can occur for a text to match.

For example, `{3,6}` indicates the character/character set modified can be repeated 3, 4, 5, or 6 times.

Parts of the modifier can be omitted:
* `{min,}` means there is no upper limit (`max` is infinity)
* `{,max}` means `min` is 0

In [6]:
pattern = re.compile("wa{2,4}ter")
print(pattern.findall("water"))
print(pattern.findall("waater"))
print(pattern.findall("waaater"))
print(pattern.findall("waaaater"))
print(pattern.findall("waaaaater"))

[]
['waater']
['waaater']
['waaaater']
[]


In [7]:
pattern = re.compile("wat{3,}er")
print(pattern.findall("watter"))
print(pattern.findall("wattter"))
print(pattern.findall("wattttttttttttttttttttttttttttttttttttttter"))

[]
['wattter']
['wattttttttttttttttttttttttttttttttttttttter']


In [8]:
pattern = re.compile("wat{,3}er")
print(pattern.findall("waer"))
print(pattern.findall("water"))
print(pattern.findall("watter"))
print(pattern.findall("wattter"))
print(pattern.findall("watttter"))

['waer']
['water']
['watter']
['wattter']
[]
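
The range syntax is general enough to express the earlier operators: `*` behaves like `{0,}`, `+` like `{1,}`, and `?` like `{0,1}`. The sketch below checks this on a handful of sample strings; the strings and the `equivalent` list are illustrative names, not part of the lesson's earlier cells.

```python
import re

# Each pair: (shorthand pattern, the same pattern written with range syntax)
pairs = [("hey*", "hey{0,}"), ("hey+", "hey{1,}"), ("coder?", "coder{0,1}")]
samples = ["he", "hey", "heyyy", "code", "coder", "coderr"]

equivalent = []
for short, explicit in pairs:
    # Compare what each form finds in every sample string.
    short_hits = [re.findall(short, s) for s in samples]
    explicit_hits = [re.findall(explicit, s) for s in samples]
    equivalent.append(short_hits == explicit_hits)
    print(short, explicit, short_hits == explicit_hits)
```

Each line should end in `True`. In practice the shorthand operators are usually preferred because they are easier to read.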


### Repeating character classes

In addition to repeating individual characters, we can also use regex to repeat characters that belong to a class.

In [9]:
pattern = re.compile("[a-zA-Z]+")
print(pattern.fullmatch("FutureProgrammer"))
print(pattern.fullmatch("FutureProgrammer360"))

<re.Match object; span=(0, 16), match='FutureProgrammer'>
None
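
Character classes can be combined with any of the modifiers above, not only `+`. As a small sketch, assume a made-up rule that a PIN must consist of 4 to 6 digits (the rule and the names `pin` and `checks` are hypothetical, used only for illustration):

```python
import re

# Hypothetical rule for illustration: a PIN is 4 to 6 digits.
# `{4,6}` modifies the whole class `[0-9]`, not a single literal character.
pin = re.compile("[0-9]{4,6}")

checks = {}
for text in ["123", "1234", "123456", "1234567", "12a4"]:
    # fullmatch requires the entire string to fit the pattern.
    checks[text] = bool(pin.fullmatch(text))
    print(text, checks[text])
```

Only `1234` and `123456` should pass: the others are too short, too long, or contain a character outside the class.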


## Summary

That will be all for today's lesson on repetition modifiers. Now you can use regexes to match texts of varying lengths. This greatly increases the power of regular expressions.