Ch06 - Manipulating Strings
---

In [1]:
print('nice'.ljust(10, '.') + '150000'.rjust(10,'□'))

nice......□□□□150000


In [2]:
def printPicnic(itemsDict, leftWidth, rightWidth):
    print('PICNIC ITEMS'.center(leftWidth + rightWidth, '-'))
    for k, v in itemsDict.items():
        print(k.ljust(leftWidth, '.') + str(v).rjust(rightWidth))
picnicItems = {'sandwiches': 4, 'apples': 12, 'cups': 4, 'cookies': 8000}

In [3]:
printPicnic(picnicItems, 12, 5)

---PICNIC ITEMS--
cups........    4
apples......   12
sandwiches..    4
cookies..... 8000


In [4]:
printPicnic(picnicItems, 20, 6)

-------PICNIC ITEMS-------
cups................     4
apples..............    12
sandwiches..........     4
cookies.............  8000


Ch07 Matching Multiple Groups with the Pipe
---

Say you want to find a Taipei phone number in a string.

Pattern example: 02-211-8800

- length: 11 characters (9 digits and 2 hyphens),
- a two-digit area code, such as 02,
- `-`, the first hyphen, after the area code,
- three more numeric characters,
- `-`, another hyphen,
- and finally four more digits.

In [5]:
def isPhoneNumber(text):
    if len(text) != 11:
        return False
    for i in range(0, 2):
        if not text[i].isdecimal():
            return False
    if text[2] != '-':
        return False
    for i in range(3, 6):
        if not text[i].isdecimal():
            return False
    if text[6] != '-':
        return False
    for i in range(7, 11):
        if not text[i].isdecimal():
            return False
    return True

In [6]:
print('087-222222 is a phone number:')
print(isPhoneNumber('087-222222'))

087-222222 is a phone number:
False


In [7]:
print('02-211-8800 is a phone number:')
print(isPhoneNumber('02-211-8800'))

02-211-8800 is a phone number:
True


Finding phone numbers in a message
---
1. Test every consecutive substring of length 11.

In [8]:
message = 'The representative number of campus number is 02-211-8800 and 02-211-8700 is scanner.'
k = 1
for i in range(len(message)):
    chunk = message[i:i+11]
    if isPhoneNumber(chunk):
        print(k, ' Phone number found: ' + chunk)
        k = k + 1
print('_+_ Done and ', k-1, ' found')

1  Phone number found: 02-211-8800
2  Phone number found: 02-211-8700
_+_ Done and  2  found


Finding Patterns of text with regular expressions
---
This seems to work fine. However, the unhyphenated string '022118800' is not recognized:


In [9]:
print(isPhoneNumber('022118800'))

False


`re` module
---
Python's regular expression module, `re`, is designed for this kind of work:

1. Import the regex module with `import re`.
2. Create a Regex object with the `re.compile()` function. (Remember to use a raw string.)
3. Pass the string you want to search into the Regex object’s `search()` method. This returns a Match object.
4. Call the Match object’s `group()` method to return a string of the actual matched text.


In [10]:
import re
phoneNumRegex1 = re.compile(r'(\d\d)-(\d\d\d-\d\d\d\d)')
mo = phoneNumRegex1.search('CGU number is 02-211-8800.')
mo.group()

'02-211-8800'
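
The parentheses in the pattern above create two groups, so the Match object can also return the area code and the local number separately. The sketch below assumes the same pattern as `phoneNumRegex1`; the names `phone_regex` and `split_phone_number` are hypothetical and introduced only for illustration. It also guards against `search()` returning `None` when nothing matches, in which case calling `group()` would raise an `AttributeError`.

```python
import re

# Same pattern as phoneNumRegex1: group 1 is the area code, group 2 the local number.
phone_regex = re.compile(r'(\d\d)-(\d\d\d-\d\d\d\d)')

def split_phone_number(text):
    """Return (area_code, local_number), or None if no phone number is found."""
    mo = phone_regex.search(text)
    if mo is None:  # search() returns None when there is no match
        return None
    return mo.group(1), mo.group(2)

print(split_phone_number('CGU number is 02-211-8800.'))  # ('02', '211-8800')
print(split_phone_number('No number here.'))             # None
```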

In [11]:
phoneNumRegex2 = re.compile(r'(\(\d\d\))-(\d\d\d-\d\d\d\d)')
mo = phoneNumRegex2.search('CGU tel number is (02)-211-8800')
mo.group()

'(02)-211-8800'

`(\()?` and `(\))?`: the `?` marks the preceding group as optional, so the pattern matches whether or not the `(` and `)` are present.

In [12]:
phoneNumRegex3 = re.compile(r'((\()?\d\d(\))?)-(\d\d\d-\d\d\d\d)')
mo = phoneNumRegex3.search('CGU tel number is 02-211-8800')
mo.group()

'02-211-8800'

In [13]:
mo = phoneNumRegex3.search('CGU tel number is (02)-211-8800')
mo.group()

'(02)-211-8800'
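
The `?` modifier can also make the hyphens themselves optional, which covers the unhyphenated form `'022118800'` that `isPhoneNumber()` rejected earlier. The following is a minimal sketch; the name `flexible_regex` is an assumption introduced for illustration. Note that this simple pattern does not check that the parentheses are balanced.

```python
import re

# Optional parentheses around the area code, and optional hyphens.
# Assumption: a two-digit area code followed by 3 + 4 digits, as in the examples above.
flexible_regex = re.compile(r'\(?\d\d\)?-?\d\d\d-?\d\d\d\d')

for candidate in ['02-211-8800', '(02)-211-8800', '022118800']:
    mo = flexible_regex.search(candidate)
    print(candidate, '->', mo.group() if mo else None)
```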

Other modifiers
---
- `(...)*`: matches zero or more repetitions (the star),
- `(...)+`: matches one or more repetitions (the plus),
- `(...){2,3}`: matches a specific number of repetitions, here 2 or 3,
- `^`, `$` (caret and dollar): match at the beginning or end of the text,
- `.` (dot): wildcard that matches any single character except a newline,
- `|` (pipe): matches either of the alternatives.
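
Each modifier in the list above can be tried on a short string. The patterns and sample strings below are made up for illustration, and the variable names are hypothetical.

```python
import re

# * : zero or more repetitions of the group
star_none = re.search(r'A(ha)*!', 'A!').group()        # 'A!'
star_many = re.search(r'A(ha)*!', 'Ahaha!').group()    # 'Ahaha!'

# + : one or more repetitions, so 'A!' no longer matches
plus_none = re.search(r'A(ha)+!', 'A!')                # None

# {2,3} : two or three repetitions (greedy, so three are taken here)
braces = re.search(r'(ha){2,3}', 'hahahaha').group()   # 'hahaha'

# ^ and $ : anchor the pattern to the beginning and end of the text
all_digits = re.search(r'^\d+$', '022118800') is not None      # True
with_hyphens = re.search(r'^\d+$', '02-211-8800') is not None  # False

# . : any single character except a newline
dot = re.findall(r'.at', 'The cat sat on the flat mat.')  # ['cat', 'sat', 'lat', 'mat']

# | : either alternative
pipe = re.search(r'8800|8700', 'scanner: 02-211-8700').group()  # '8700'

print(star_none, star_many, plus_none, braces, all_digits, with_hyphens, dot, pipe)
```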

The `findall()` method
---

In [15]:
mo = phoneNumRegex3.search('CGU tel number is 02-211-8800 and transfer number is (02)-211-8700')
mo.group()

'02-211-8800'

In [16]:
mo = phoneNumRegex3.findall('CGU tel number is 02-211-8800 and transfer number is (02)-211-8700')
mo,mo[1],

([('02', '', '', '211-8800'), ('(02)', '(', ')', '211-8700')],
 ('(02)', '(', ')', '211-8700'))

In [17]:
print(mo[1][0],'-',mo[1][3])

(02) - 211-8700
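
Because `phoneNumRegex3` contains groups, `findall()` returns one tuple of groups per match rather than the full matched text. One way to recover the full text of every match is `finditer()`, which yields Match objects. A short sketch, assuming the same pattern; the name `full_matches` is introduced here for illustration:

```python
import re

phoneNumRegex3 = re.compile(r'((\()?\d\d(\))?)-(\d\d\d-\d\d\d\d)')
text = 'CGU tel number is 02-211-8800 and transfer number is (02)-211-8700'

# finditer() yields one Match object per match; group() is the full matched text.
full_matches = [mo.group() for mo in phoneNumRegex3.finditer(text)]
print(full_matches)  # ['02-211-8800', '(02)-211-8700']
```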


Character Classes
---
```
\d Any numeric digit from 0 to 9.
\D Any character that is not a numeric digit from 0 to 9.
\w Any letter, numeric digit, or the underscore character. 
\W Any character that is not a letter, numeric digit, or the underscore character
\s Any space, tab, or newline character.
\S Any character that is not a space, tab, or newline.

```
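
The shorthand classes above can be compared side by side with `findall()`. The sample string is made up for illustration and contains only ASCII characters.

```python
import re

sample = 'Room 12, ext_34!'

digits = re.findall(r'\d', sample)        # ['1', '2', '3', '4']
words = re.findall(r'\w+', sample)        # ['Room', '12', 'ext_34']
spaces = re.findall(r'\s', sample)        # [' ', ' ']
non_digits = re.findall(r'\D+', sample)   # ['Room ', ', ext_', '!']
non_spaces = re.findall(r'\S+', sample)   # ['Room', '12,', 'ext_34!']

print(digits, words, spaces, non_digits, non_spaces)
```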

In [18]:
import pyperclip, re

In [19]:
phoneRegex = re.compile(r'''(
    (\d{2}|\(\d{2}\))?                # area code, 2 digits
    (\s|-|\.)?                        # separator
    (\d{3})                           # first 3 digits
    (\s|-|\.)                         # separator
    (\d{4})                           # last 4 digits
    (\s*(ext|x|ext\.)\s*(\d{2,5}))?   # extension
    )''', re.VERBOSE)

In [20]:
text='CGU tel number is 02-211-8800 and transfer number is (02)-211-8700'
matches = []
for groups in phoneRegex.findall(text):
    phoneNum = '-'.join([groups[1], groups[3], groups[5]])
    if groups[8] != '':
        phoneNum += ' x' + groups[8]
    matches.append(phoneNum)

In [22]:
# Copy results to the clipboard.
if len(matches) > 0:
    pyperclip.copy('\n'.join(matches)) # copy all matches as one string
    print('Copied to clipboard:')
    print('\n'.join(matches))          # print one match per line
else:
    print('No phone numbers found.')

Copied to clipboard:
02-211-8800
(02)-211-8700


Passwords Design
---
App providers now routinely ask customers and users to strengthen their passwords for security. But how can the strength of a password be checked?

In [23]:
import string

In [24]:
string.ascii_uppercase 

'ABCDEFGHIJKLMNOPQRSTUVWXYZ'

In [25]:
string.ascii_lowercase 

'abcdefghijklmnopqrstuvwxyz'

Requirements
---
At least 8 characters, including at least one digit, one uppercase letter, one lowercase letter, and one punctuation character.

In [26]:
def is_strong_password(s):
    # A strong password has at least 8 characters and contains an uppercase
    # letter, a lowercase letter, a digit, and a punctuation character.
    length_regex = re.compile(r'.{8,}')
    upper_regex = re.compile(r'[A-Z]')
    lower_regex = re.compile(r'[a-z]')
    digit_regex = re.compile(r'[0-9]')
    punctuation_regex = re.compile(r'[!"#$%&\'()*+,\-./:;<=>?@[\]^_`{|}~]')
    if (length_regex.search(s) and
            upper_regex.search(s) and
            lower_regex.search(s) and
            digit_regex.search(s) and
            punctuation_regex.search(s)):
        return 'Welcome!'
    return False

In [27]:
passwd='0Us#dwq_'
is_strong_password(passwd)

'Welcome!'
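
The same requirements can also be checked without regular expressions, using the constants of the `string` module. This is an alternative sketch, not the notebook's method: the name `meets_requirements` and the `min_length` parameter are assumptions, and unlike `is_strong_password()` it always returns a boolean.

```python
import string

def meets_requirements(password, min_length=8):
    """Return True if the password satisfies every requirement listed above."""
    return (
        len(password) >= min_length
        and any(c in string.ascii_uppercase for c in password)
        and any(c in string.ascii_lowercase for c in password)
        and any(c in string.digits for c in password)
        and any(c in string.punctuation for c in password)
    )

print(meets_requirements('0Us#dwq_'))  # True
print(meets_requirements('password'))  # False: no uppercase, digit, or punctuation
```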