### findall() Method

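The search() method used earlier returns only the first match. The findall() method returns every match in the text.
- findall() returns a list of strings (when the regex has no groups), not a match object
- because there is no match object, you can't call mo.group() on the result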

In [1]:
import re

In [9]:
#example: finding all phone numbers in a string
phoneRegex = re.compile(r'\d\d\d-\d\d\d-\d\d\d\d')
numbers = phoneRegex.findall('my numbers are 467-333-3333 and 111-333-5555')
numbers

['467-333-3333', '111-333-5555']
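To make the difference from search() concrete, here is a minimal sketch that runs both methods on the same text. The variable names (`firstNumber`, `allNumbers`, `noNumbers`) are only illustrative.

```python
import re

phoneRegex = re.compile(r'\d\d\d-\d\d\d-\d\d\d\d')
text = 'my numbers are 467-333-3333 and 111-333-5555'

# search() returns a match object for the first match only
mo = phoneRegex.search(text)
firstNumber = mo.group()

# findall() returns a plain list with every match
allNumbers = phoneRegex.findall(text)

# when nothing matches, findall() returns an empty list (search() would return None)
noNumbers = phoneRegex.findall('no numbers here')

print(firstNumber)
print(allNumbers)
print(noNumbers)
```

Since findall() always returns a list, the result can be looped over directly without checking for None first.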

### Groups

If the regex has two or more groups, findall() returns a list of tuples instead of a list of strings. Each tuple is one match, and each element of a tuple is the text matched by one of the groups.

In [10]:
phoneRegex = re.compile(r'(\d\d\d)-(\d\d\d-\d\d\d\d)')
numbers = phoneRegex.findall('my numbers are 467-333-3333 and 111-333-5555')
numbers

[('467', '333-3333'), ('111', '333-5555')]
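A side effect of groups is that the full match no longer appears in the result. One possible workaround, shown here as a sketch, is to wrap the whole pattern in an extra outer group. The second part shows what happens with a single group: findall() then returns plain strings holding only that group's text. The names `fullRegex` and `areaRegex` are made up for this example.

```python
import re

text = 'my numbers are 467-333-3333 and 111-333-5555'

# outer group = whole number, inner groups = area code and main number
fullRegex = re.compile(r'((\d\d\d)-(\d\d\d-\d\d\d\d))')
withFullMatch = fullRegex.findall(text)

# with exactly one group, findall() returns a list of strings (the group text)
areaRegex = re.compile(r'(\d\d\d)-\d\d\d-\d\d\d\d')
areaCodes = areaRegex.findall(text)

print(withFullMatch)
print(areaCodes)
```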

### Character Classes

* \d --> numeric digit (0-9)
* \D --> any character that is NOT numeric digit (0-9)
    * numbers
* \w --> any letter, numeric digit or _
* \W --> any character that is NOT letter, number or _
    * words
* \s --> any space, tab, newline character
* \S --> any character that is NOT space, newline or tab
    * spaces
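As a quick check of the list above, this sketch runs each shorthand class and its negated version on one small sample string. The sample string and the `results` dictionary are chosen only for illustration.

```python
import re

sample = 'a1 _b\t2'   # letter, digit, space, underscore, letter, tab, digit

results = {
    'd': re.findall(r'\d', sample),   # digits
    'D': re.findall(r'\D', sample),   # everything that is not a digit
    'w': re.findall(r'\w', sample),   # letters, digits and _
    'W': re.findall(r'\W', sample),   # everything that is not a letter, digit or _
    's': re.findall(r'\s', sample),   # whitespace
    'S': re.findall(r'\S', sample),   # everything that is not whitespace
}

for name, matches in results.items():
    print(name, matches)
```

Each uppercase class returns exactly the characters that its lowercase partner skips.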

In [12]:
#getting every number followed by a word in the song lyrics
lyrics = '12 drummers drumming, 11 pipers piping, 10 lords a leaping, 9 ladies dancing, 8 maids a milking, 7 swans a swimming, 6 geese a laying, 5 golden rings, 4 calling birds, 3 french hens, 2 turtle doves, and 1 partridge in a pear tree'
xmasRegex = re.compile(r'\d+\s\w+')     #\s matches the space between the number and the word
xmasRegex.findall(lyrics)

['12 drummers',
 '11 pipers',
 '10 lords',
 '9 ladies',
 '8 maids',
 '7 swans',
 '6 geese',
 '5 golden',
 '4 calling',
 '3 french',
 '2 turtle',
 '1 partridge']

### Making your own character classes

The classes seen above are the standard shorthand ones.
- create your own character class by writing all the characters that belong to it between square brackets

In [13]:
vowelRegex = re.compile(r'[aeiouAEIOU]')
vowelRegex.findall('Robocop eats baby fOOd')

['o', 'o', 'o', 'e', 'a', 'a', 'O', 'O']

In [14]:
doubleVowelRegex = re.compile(r'[aeiouAEIOU]{2}')
doubleVowelRegex.findall('Robocop eats baby fOOd')

['ea', 'OO']
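Listing every character gets long for bigger classes. Square brackets also accept ranges written with a hyphen, such as a-z or A-Z. The sketch below reuses the same sample sentence; the regex names are hypothetical.

```python
import re

text = 'Robocop eats baby fOOd'

# a-z covers every lowercase letter; + means one or more in a row
lowerRunRegex = re.compile(r'[a-z]+')
lowerRuns = lowerRunRegex.findall(text)

# A-Z covers every uppercase letter
upperRegex = re.compile(r'[A-Z]')
upperLetters = upperRegex.findall(text)

print(lowerRuns)
print(upperLetters)
```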

### Negative Character Classes

Adding a ^ right after the opening square bracket of a character class makes it match every character that is NOT inside the brackets.

In [15]:
#finding everything that is not a vowel: consonants, spaces, punctuation, etc.
notVowelRegex = re.compile(r'[^aeiouAEIOU]')
notVowelRegex.findall('Robocop eats baby fOOd')

['R', 'b', 'c', 'p', ' ', 't', 's', ' ', 'b', 'b', 'y', ' ', 'f', 'd']
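The position of the ^ matters. It only negates the class when it is the first character after the opening bracket; anywhere else it is an ordinary ^ character. A small sketch, using a made-up sample string:

```python
import re

text = 'x^2 + a'

# ^ first: matches everything that is not a vowel
negated = re.findall(r'[^aeiou]', text)

# ^ not first: matches vowels and the literal ^ character
literalCaret = re.findall(r'[aeiou^]', text)

print(negated)
print(literalCaret)
```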