# Table of Contents
- [2.4 Matching and Searching for Text Patterns](#2.4)
- [2.7 Specifying a Regular Expression for the Shortest Match](#2.7)
- [2.8 Writing a Regular Expression for Multiline Patterns](#2.8)
- [2.14 Combining and Concatenating Strings](#2.14)

---
## <a name="2.4"></a>2.4 Matching and Searching for Text Patterns

### Solution

In [1]:
import re

t1 = '12/25/2016'
t2 = 'Dec 25, 2016'

In [2]:
print(bool(re.match(r'\d+/\d+/\d+', t1)))
print(bool(re.match(r'\d+/\d+/\d+', t2)))

True
False


- If the same pattern is to be matched several times, it usually pays to precompile it with `re.compile()`.

In [3]:
datepat = re.compile(r'\d+/\d+/\d+')

print(bool(datepat.match(t1)))
print(bool(datepat.match(t2)))

True
False


`match()` only checks for a match at the beginning of the string, and returns at most one match.  
To find every occurrence anywhere in the text, use `findall()`.

In [4]:
t3 = '12/25/2016 ------ 12/26/2016 ----- 12/27/2017'

datepat.findall(t3)

['12/25/2016', '12/26/2016', '12/27/2017']
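Because `match()` is anchored at the start of the string, it can behave differently from what one might expect. The following is a minimal sketch contrasting `match()`, `search()`, and `fullmatch()`; the sample string `text` is made up for illustration and is not part of the original notes.

```python
import re

datepat = re.compile(r'\d+/\d+/\d+')
text = 'Today is 12/25/2016.'   # illustrative input, not from the original

# match() is anchored at the start of the string, so this finds nothing
print(datepat.match(text))                      # None

# search() scans forward and returns the first match anywhere in the string
print(datepat.search(text).group())             # 12/25/2016

# match() does not require the pattern to reach the end of the string
print(datepat.match('12/25/2016abc').group())   # 12/25/2016

# fullmatch() requires the entire string to match the pattern
print(datepat.fullmatch('12/25/2016abc'))       # None
```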

- Matching iteratively: `finditer`

In [5]:
for m in datepat.finditer(t3):
    print(m.group())

12/25/2016
12/26/2016
12/27/2017


- Capturing groups

In [6]:
datepat = re.compile(r'(\d+)/(\d+)/(\d+)')

m = datepat.match(t1)
print(m.group(0))
print(m.group(1))
print(m.group(2))
print(m.group(3))
print(m.groups())

12/25/2016
12
25
2016
('12', '25', '2016')
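Capture groups make it easy to pull a match apart and reuse its pieces. As a small sketch (the YYYY-MM-DD output format is only an illustration), the groups can be unpacked directly, and `findall()` returns tuples of groups when the pattern contains them:

```python
import re

datepat = re.compile(r'(\d+)/(\d+)/(\d+)')
t3 = '12/25/2016 ------ 12/26/2016 ----- 12/27/2017'

# Unpack the three groups of a single match
month, day, year = datepat.match('12/25/2016').groups()
print(year, month, day)     # 2016 12 25

# With groups in the pattern, findall() returns a list of tuples
matches = datepat.findall(t3)
print(matches)
# [('12', '25', '2016'), ('12', '26', '2016'), ('12', '27', '2017')]

# Rearrange each date as YYYY-MM-DD
reordered = ['{}-{}-{}'.format(y, m, d) for m, d, y in matches]
print(reordered)            # ['2016-12-25', '2016-12-26', '2017-12-27']
```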


---
## <a name="2.7"></a>2.7 Specifying a Regular Expression for the Shortest Match

In [7]:
pat = re.compile(r'\"(.*)\"')

s1 = '"First Word" "Second Word"'
pat.findall(s1)

['First Word" "Second Word']

### Solution

In [8]:
pat_non_greedy = re.compile(r'\"(.*?)\"')

pat_non_greedy.findall(s1)

['First Word', 'Second Word']


- The `*` operator is greedy: it matches as much text as possible.
    - Add the `?` modifier after the `*` operator to make it non-greedy, so that it matches as little text as possible.
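For this particular quoted-text case, a negated character class is a possible alternative to the non-greedy modifier. This is only a sketch; the name `pat_negated` is introduced here for illustration and does not appear in the original notes.

```python
import re

s1 = '"First Word" "Second Word"'

# [^"]* matches any run of characters that are not a double quote,
# so the match cannot run past the closing quote
pat_negated = re.compile(r'"([^"]*)"')
result = pat_negated.findall(s1)
print(result)   # ['First Word', 'Second Word']
```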

---
## <a name="2.8"></a>2.8 Writing a Regular Expression for Multiline Patterns

In [9]:
pat = re.compile(r'/\*(.*?)\*/')

t1 = "/* 123 \n 456 */"
pat.findall(t1)

[]

### Solution

In [10]:
pat_non_capture_group = re.compile(r'/\*((?:.|\n)*?)\*/')
pat_non_capture_group.findall(t1)

[' 123 \n 456 ']

- `(?:...)` specifies a non-capturing group
    - (i.e. a group used only for matching; the text it matches is not captured separately)
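Another option is the `re.DOTALL` flag, which makes `.` match every character including the newline. The sketch below reuses the original simple pattern with that flag; the name `pat_dotall` is chosen here for illustration.

```python
import re

t1 = "/* 123 \n 456 */"

# With re.DOTALL, '.' also matches '\n', so the simple pattern works
pat_dotall = re.compile(r'/\*(.*?)\*/', re.DOTALL)
result = pat_dotall.findall(t1)
print(result)   # [' 123 \n 456 ']
```

For simple cases the flag is usually enough. Spelling out the newline in the pattern itself, as in the solution above, keeps the behaviour independent of any flags.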

---
## <a name="2.14"></a>2.14 Combining and Concatenating Strings

### Discussion
- Using `+` to join a lot of strings is inefficient and not recommended
- Sometimes concatenation is not necessary at all, for example when `print()` can insert the separator itself

In [11]:
a, b, c = "123"

print(a + ":" + b + ":" + c)  # Worst: explicit concatenation
print(':'.join([a, b, c]))    # Still bad: builds a string only to print it

print(a, b, c, sep=':')       # Better: print() inserts the separator

1:2:3
1:2:3
1:2:3
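When a combined string is actually needed (rather than just printed), `str.join()` is the usual choice for many pieces. The sketch below compares it with repeated `+=`; the sample lists `parts` and `data` are made up for illustration.

```python
parts = ['Is', 'Chicago', 'Not', 'Chicago?']   # illustrative data

# Repeated += may create a new string object on every iteration
s = ''
for p in parts:
    s += p + ' '
s = s.rstrip()

# join() builds the result in a single pass
joined = ' '.join(parts)
print(joined)           # Is Chicago Not Chicago?
print(s == joined)      # True

# join() only accepts strings, so convert other values first;
# a generator expression avoids building a temporary list
data = ['ACME', 50, 91.1]
csv_line = ','.join(str(d) for d in data)
print(csv_line)         # ACME,50,91.1
```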
