## `re.split(pattern, string, maxsplit=0, flags=0)`

Split *string* by the occurrences of *pattern* and return the resulting list of substrings.

* If *pattern* contains capturing parentheses, the text of every group is also returned as part of the resulting list.
* If *maxsplit* is nonzero, at most *maxsplit* splits occur, and the remainder of the string is returned as the final element of the list.
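
The effect of *maxsplit* and *flags* can be seen in a short sketch. The sample strings are illustrative and are not taken from the cells in this notebook; both arguments are passed by keyword.

```python
import re

# Stop after the first split; the rest of the string stays in the last element.
limited = re.split(r'\W+', 'Words, words, words.', maxsplit=1)
print(limited)  # ['Words', 'words, words.']

# flags changes how the pattern matches: with re.IGNORECASE, [a-f]+ also splits on 'B'.
ignore_case = re.split(r'[a-f]+', '0a3B9', flags=re.IGNORECASE)
print(ignore_case)  # ['0', '3', '9']
```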

In [1]:
import re

In [6]:
phrase1 = '...Words, words.'
phrase2 = 'Words, words.'

In [8]:
re.split(r'\W+', phrase1)

['', 'Words', 'words', '']

In [9]:
re.split(r'\W+', phrase2)

['Words', 'words', '']

### Capturing groups
If there are capturing groups in the separator and it matches at the start of the string, the result will start with an empty string. The same holds for the end of the string. That way, separator components are always found at the same relative indices within the result list:

In [11]:
re.split(r'(\W+)', phrase1)

['', '...', 'Words', ', ', 'words', '.', '']

In [12]:
re.split(r'(\W+)', phrase2)

['Words', ', ', 'words', '.', '']
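
One consequence of this fixed layout, sketched here with the same `phrase1`: when the whole pattern is a single capturing group, text lands at even indices and separators at odd indices, and joining the pieces restores the original string. The names `parts`, `text` and `separators` are only for illustration.

```python
import re

phrase1 = '...Words, words.'
parts = re.split(r'(\W+)', phrase1)

text = parts[0::2]        # pieces between separators, including the empty edges
separators = parts[1::2]  # what the capturing group matched

print(text)        # ['', 'Words', 'words', '']
print(separators)  # ['...', ', ', '.']

# Nothing is discarded, so the pieces concatenate back to the input.
assert ''.join(parts) == phrase1
```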

### `\b`
Matches the empty string, but only at the beginning or end of a word. A word is defined as a sequence of word characters. Note that formally, `\b` is defined as the boundary between a `\w` and a `\W` character (or vice versa), or between `\w` and the beginning/end of the string. This means that `r'\bfoo\b'` matches `'foo'`, `'foo.'`, `'(foo)'`, `'bar foo baz'` but not `'foobar'` or `'foo3'`.
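
The `r'\bfoo\b'` claims can be checked directly. This is a minimal sketch; the names `word`, `samples` and `matches` are illustrative.

```python
import re

word = re.compile(r'\bfoo\b')
samples = ['foo', 'foo.', '(foo)', 'bar foo baz', 'foobar', 'foo3']

# True where 'foo' appears as a whole word, False otherwise.
matches = {s: bool(word.search(s)) for s in samples}
for s, found in matches.items():
    print(f'{s!r}: {found}')
```

The first four samples report `True`. `'foobar'` and `'foo3'` report `False` because `b` and `3` are word characters, so there is no boundary after `foo`.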

In [13]:
re.split(r'\b', phrase1)

['...', 'Words', ', ', 'words', '.']

In [19]:
re.split(r'(\b)', phrase1)

['...', '', 'Words', '', ', ', '', 'words', '', '.']

In [20]:
re.split(r'\b', phrase2)

['', 'Words', ', ', 'words', '.']

In [21]:
re.split(r'(\b)', phrase2)

['', '', 'Words', '', ', ', '', 'words', '', '.']

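Since Python 3.7, `re.split` also splits on empty matches of the pattern, but only when the empty match is not adjacent to a previous empty match. An empty match may still directly follow a non-empty one, which is why `\W*` produces an extra `''` after each run of non-word characters in the outputs below.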

In [15]:
re.split(r'\W*', phrase1)

['', '', 'W', 'o', 'r', 'd', 's', '', 'w', 'o', 'r', 'd', 's', '', '']

In [16]:
re.split(r'\W*', phrase2)

['', 'W', 'o', 'r', 'd', 's', '', 'w', 'o', 'r', 'd', 's', '', '']

In [24]:
re.split(r'\W*', '...words...')

['', '', 'w', 'o', 'r', 'd', 's', '', '']

In [25]:
re.split(r'(\W*)', '...words...')

['', '...', '', '', 'w', '', 'o', '', 'r', '', 'd', '', 's', '...', '', '', '']
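
A shorter input makes the empty-match rule easier to trace by hand. The string `'axbc'` is an illustrative choice, and the output shown assumes Python 3.7 or later, where splitting on empty matches is supported.

```python
import re

# x* matches the single 'x' and otherwise only the empty string.
pieces = re.split(r'x*', 'axbc')
print(pieces)  # ['', 'a', '', 'b', 'c', '']

# Filtering out the empty strings keeps only the non-empty pieces.
non_empty = [p for p in pieces if p]
print(non_empty)  # ['a', 'b', 'c']
```

The `''` between `'a'` and `'b'` comes from the empty match that directly follows the non-empty match of `'x'`.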