# `re.findall()`
## Non-overlapping matches, groups and alternatives

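`re.findall()` returns all non-overlapping matches of a pattern, scanning the string from left to right. In Python 3, `\w` is Unicode-aware for `str` patterns, so umlauts are matched as word characters.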

In [1]:
import re
text = 'Viele Köche verderben den Brei.'
pattern = r'\w+'
print(re.findall(pattern, text))

['Viele', 'Köche', 'verderben', 'den', 'Brei']
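
To make "non-overlapping" concrete, here is a minimal sketch: once a match has consumed characters, the scan resumes after it. The second half contrasts the default Unicode-aware `\w` with the `re.ASCII` flag. The sample string `'aaa'` and the variable names are illustrative assumptions, not part of the original notebook.

```python
import re

# Matches never overlap: after 'aa' is found at positions 0-1,
# scanning resumes at position 2, so 'aaa' yields one match, not two.
non_overlapping = re.findall(r'aa', 'aaa')
print(non_overlapping)  # ['aa']

# For str patterns, \w is Unicode-aware by default, so 'ö' is a word character.
text = 'Viele Köche verderben den Brei.'
unicode_words = re.findall(r'\w+', text)
print(unicode_words)  # ['Viele', 'Köche', 'verderben', 'den', 'Brei']

# With re.ASCII, \w only matches [a-zA-Z0-9_], so 'Köche' is split at the umlaut.
ascii_words = re.findall(r'\w+', text, flags=re.ASCII)
print(ascii_words)  # ['Viele', 'K', 'che', 'verderben', 'den', 'Brei']
```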


## Alternatives are _not_ commutative
The order of regex alternatives matters: the engine tries them from left to right and takes the first one that matches, not the longest one.

In [2]:
re.findall(r'a|aa', "Saal")

['a', 'a']

In [3]:
re.findall(r'aa|a', "Saal")

['aa']
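
When an alternation is built from a list of strings, one common way to avoid a short alternative shadowing a longer one is to sort the alternatives longest first. The following is only a sketch of that idea; the helper name `longest_first_pattern` is hypothetical and not part of the `re` module.

```python
import re

def longest_first_pattern(alternatives):
    """Build an alternation in which longer alternatives are tried first."""
    # Sorting by length (descending) keeps a short alternative such as 'a'
    # from shadowing a longer one such as 'aa' that starts the same way.
    ordered = sorted(alternatives, key=len, reverse=True)
    # re.escape keeps any regex metacharacters in the alternatives literal.
    return '|'.join(re.escape(alt) for alt in ordered)

pattern = longest_first_pattern(['a', 'aa'])
print(pattern)                      # aa|a
print(re.findall(pattern, 'Saal'))  # ['aa']
```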

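## Capturing and non-capturing groups
Parentheses create capturing (referenceable) groups. They are numbered from left to right and can be referenced as `\1`, `\2`, ..., for example in the replacement string of `re.sub()`.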

In [4]:
text = 'Blick-Leser, A-Post-Fans und andere Bindestrich-Komposita'
re.sub(r'(\w+-)+(\w+)', r'\2', text)

'Leser, Fans und andere Komposita'
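
To see what each group number refers to, the groups can be inspected on a match object. This sketch also shows that a repeated group keeps only the text of its last iteration; the input strings are taken from the example above, and the variable names are illustrative.

```python
import re

# Groups are numbered by their opening parenthesis, left to right.
match = re.match(r'(\w+-)+(\w+)', 'A-Post-Fans')
print(match.group(0))  # A-Post-Fans  (the whole match)
print(match.group(1))  # Post-        (a repeated group keeps only its last iteration)
print(match.group(2))  # Fans

# In a replacement string, \1 and \2 refer to the same group numbers.
swapped = re.sub(r'(\w+)-(\w+)', r'\2-\1', 'Blick-Leser')
print(swapped)  # Leser-Blick
```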

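### Non-capturing groups: `(?:REGEX)`
The `(?:...)` syntax groups a subpattern without capturing it, so it cannot be referenced and gets no group number. In the example below, the word after the last hyphen therefore becomes group 1.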


In [5]:
re.sub(r'(?:\w+-)+(\w+)', r'\1', text)

'Leser, Fans und andere Komposita'
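
The effect on group numbering can be checked directly on compiled patterns. This is a small sketch; the variable names `capturing` and `non_capturing` are illustrative.

```python
import re

capturing = re.compile(r'(\w+-)+(\w+)')
non_capturing = re.compile(r'(?:\w+-)+(\w+)')

# Pattern.groups reports how many capturing groups a pattern defines.
print(capturing.groups)      # 2
print(non_capturing.groups)  # 1

# The non-capturing prefix still takes part in the match,
# but only the final word is available as group 1.
match = non_capturing.match('A-Post-Fans')
print(match.group(0))  # A-Post-Fans
print(match.groups())  # ('Fans',)
```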

## Grouping changes the return value of `re.findall()`

* **Without capturing groups**: list of the full string matches

In [6]:
re.findall(r'ah|aa', "kahler Saal")

['ah', 'aa']

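* **With capturing groups**: list of tuples of strings, where the _i_-th element holds the content matched by the _i_-th pair of parentheses, or `''` if that group did not take part in the match. With exactly one capturing group, the result is a flat list of that group's strings instead.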

In [7]:
re.findall(r'a(h)|a(a)', "kahler Saal")

[('h', ''), ('', 'a')]

How should this be read? Each tuple is one match, and each position in the tuple is one group:

    kahler Saal
      |      |
    ('h',   '')    # 1st match
    ('',    'a')   # 2nd match
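
The same matches can be examined one at a time with `re.finditer()`, which yields match objects instead of strings or tuples. As a sketch, this shows the full match, its position, and the groups side by side. On a match object, a group that did not take part is `None`, whereas `re.findall()` reports it as `''`.

```python
import re

matches = list(re.finditer(r'a(h)|a(a)', 'kahler Saal'))
for m in matches:
    # group(0) is the full match; groups() holds one entry per capturing group.
    print(m.group(0), m.span(), m.groups())
# ah (1, 3) ('h', None)
# aa (8, 10) (None, 'a')
```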

* **With non-capturing groups only**: list of the full string matches, as in the case without groups

In [8]:
re.findall(r'a(?:h)|a(?:a)', "kahler Saal")

['ah', 'aa']
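
One further case is worth a quick sketch: a pattern with exactly one capturing group returns a flat list holding only that group's content. If the full match is needed as well, an extra outer group is one option. The patterns below are illustrative variations on the example above.

```python
import re

text = 'kahler Saal'

# Exactly one capturing group: a flat list with that group's content.
single_group = re.findall(r'a(h|a)', text)
print(single_group)  # ['h', 'a']

# An extra outer group returns the full match alongside the inner group.
with_full_match = re.findall(r'(a(h|a))', text)
print(with_full_match)  # [('ah', 'h'), ('aa', 'a')]
```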