# Re Match vs Search

## Match 

`re.match` matches a regular expression pattern against the beginning of a string.

If zero or more characters at the beginning of the string match the pattern, it returns a corresponding `re.Match` object. It returns `None` if the string does not match the pattern; note that this is different from a zero-length match.

If you want to locate a match anywhere in the string, use `re.search()` instead.
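
The difference between `None` and a zero-length match can be sketched as follows. The patterns `x*` and `x+` and the sample string are illustrative choices, not part of the original test.

```python
import re

# 'x*' can match the empty string, so it succeeds at position 0 with a
# zero-length match even though the text contains no 'x'.
empty = re.match(r"x*", "abc")
print(empty.span())    # (0, 0)
print(empty.group())   # '' (empty string)

# 'x+' needs at least one 'x' at the start, so there is no match at all.
missing = re.match(r"x+", "abc")
print(missing)         # None
```

A zero-length match is still a match object, so comparing the result against `None` separates the two cases.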

## Search

`re.search` scans through a string and returns a match object for the first location where the pattern matches, or `None` if no position in the string matches.
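
As a small sketch of that behavior, the example below shows that only the first occurrence is reported. The sample sentence and the patterns are illustrative assumptions.

```python
import re

text = "A cat is not a dog."

# The search is case-sensitive by default, so the leading 'A' is skipped
# and the first hit is the 'a' inside 'cat'.
first_a = re.search("a", text)
print(first_a.span())   # (3, 4)

# A pattern that occurs nowhere in the string yields None.
no_bird = re.search("bird", text)
print(no_bird)          # None
```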

## Equivalences

In regular expressions, the `^` special character matches at the beginning of the string. Therefore:

`re.match('pattern', string)` is equivalent to `re.search('^pattern', string)` in many cases.
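
The sketch below checks the equivalence on a few sample strings and shows one case where simply prefixing `^` is not enough: a pattern with top-level alternation. The sample strings and the `a|b` pattern are illustrative assumptions.

```python
import re

samples = ["python rocks", "I like python", ""]

# For a simple pattern, both calls agree on every sample string.
simple_agrees = all(
    bool(re.match("python", s)) == bool(re.search("^python", s))
    for s in samples
)
print(simple_agrees)  # True

# Alternation binds more loosely than '^', so '^a|b' means '(^a)|b'.
match_result = re.match("a|b", "xb")
naive_search = re.search("^a|b", "xb")
print(match_result)         # None
print(naive_search.span())  # (1, 2): the 'b' is found mid-string

# Grouping the alternation restores the equivalence.
grouped_search = re.search("^(?:a|b)", "xb")
print(grouped_search)       # None
```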

## Differences

The differences are:

* `re.match('pattern')` is slightly faster than `re.search('^pattern')`.
* `re.search('^pattern')` can match at the beginning of each line in a multiline string when the `re.MULTILINE` flag is set.

## Speed

The following test is adapted from user [nosklo](https://stackoverflow.com/users/17160/nosklo) on [Stack Overflow](https://stackoverflow.com/questions/180986/what-is-the-difference-between-re-search-and-re-match/49710946#49710946).

In [1]:
import re
import random
import string

LENGTH = 10
LIST_SIZE = 100  # TODO, increase size


def generate_word():
    word = [random.choice(string.ascii_lowercase) for _ in range(LENGTH)]
    word = "".join(word) + ' '
    return word


wordlist = [generate_word() for number in range(LIST_SIZE)]
wordlist[2] = 'python'
paragraph = ' '.join(wordlist)

In [2]:
print(paragraph[:100])

aumkgatqmf  jzynobaxjp  python rlnoawvwvr  jhfhcmvzmc  qbuaquffxb  ozltnvsuqi  wmdxzbcjgb  dsbhjdomq


In [14]:
%%timeit
re_match_1 = [re.match('python', word) for word in wordlist]

122 µs ± 1.32 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)


In [13]:
%%timeit
re_search = [re.search('^python', word) for word in wordlist]

134 µs ± 1.81 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
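
Outside IPython, where the `%%timeit` magic is unavailable, a comparable measurement can be sketched with the standard `timeit` module. The `words` list, the helper names `with_match` and `with_search`, and the loop count are illustrative assumptions; absolute timings vary by machine and Python version.

```python
import re
import timeit

# Illustrative stand-in for the notebook's wordlist (assumed data).
words = ["alpha ", "beta ", "python", "gamma "] * 25

def with_match():
    return [re.match("python", word) for word in words]

def with_search():
    return [re.search("^python", word) for word in words]

LOOPS = 1000  # arbitrary choice; larger values give steadier results

match_seconds = timeit.timeit(with_match, number=LOOPS)
search_seconds = timeit.timeit(with_search, number=LOOPS)

print(f"re.match:  {match_seconds / LOOPS * 1e6:.1f} µs per loop")
print(f"re.search: {search_seconds / LOOPS * 1e6:.1f} µs per loop")
```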


In [15]:
# Names assigned inside a %%timeit cell are not kept, so build the lists again
re_match_1 = [re.match('python', word) for word in wordlist]
re_search = [re.search('^python', word) for word in wordlist]

In [21]:
re_match_1[:4]

[None, None, <re.Match object; span=(0, 6), match='python'>, None]

In [20]:
re_search[:4]

[None, None, <re.Match object; span=(0, 6), match='python'>, None]

In [32]:
print(re.match('(.*?)cat', 'A cat is not a dog.'))

<re.Match object; span=(0, 5), match='A cat'>


In [29]:
print(re.search('cat', 'A cat is not a dog.'))

<re.Match object; span=(2, 5), match='cat'>


In [28]:
print(re.match('cat', 'A cat is not a dog.'))

None


## Multiline

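`re.match` accepts the `re.MULTILINE` flag, but the flag does not change where it looks: `re.match` still matches only at the beginning of the string, not at the beginning of each line.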

`re.M`  
`re.MULTILINE`

    When specified, the pattern character '^' matches at the beginning of the string and at the beginning of each line (immediately following each newline); and the pattern character '$' matches at the end of the string and at the end of each line (immediately preceding each newline). By default, '^' matches only at the beginning of the string, and '$' only at the end of the string and immediately before the newline (if any) at the end of the string.
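
The `$` half of that description can be sketched as follows. The two-line sample text is an illustrative assumption.

```python
import re

text = "first line\nsecond line\n"

# Default: '$' matches only at the very end of the string, or just before
# a trailing newline, so only the last 'line' qualifies.
default_end = re.search(r"line$", text)
print(default_end.span())    # (18, 22)

# re.MULTILINE: '$' also matches before every newline, so the 'line' that
# ends the first line is found first.
multiline_end = re.search(r"line$", text, re.MULTILINE)
print(multiline_end.span())  # (6, 10)
```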


In [45]:
STRING = "A\ncat is not a dog"
print(STRING)

A
cat is not a dog


In [43]:
print(re.match('cat', STRING, re.MULTILINE))

None


In [44]:
print(re.search('^cat', STRING, re.MULTILINE))

<re.Match object; span=(2, 5), match='cat'>


In [47]:
print(re.search('^cat', STRING))

None
