Basics of Python
===
1. [Data type](#1.-Data-Type)<br>
  a. [1.1 For Human or Machines](#1.1-For-Human-or-Machines)<br>
  b. [1.2 Example, ASCII table's Layout](#1.2-Example,-ASCII-table's-Layout)<br>
-  [String Manipulating](#2.-String-Manipulating)
- [Regular expression Module, re](#3.-Matching-Multiple-Groups-with-the-Pipe) <br>
  a. [Whether the phone number in Taipei is](#Find-out-from-a-messages)<br>
  b. [re Module](#re-module)<br>
  c. [Password Enhancement](#Passwords-Design)<br>

# Basics of Basics

## Basic Syntax
  * arithmetic and comparison operators: 
  <code style="color:brown">+, -, *, /, **, %, ==, ... </code>
  * variables (assignment):
  <pre style="color:brown">
  x=...
  </pre>
  * defining a function:
  <pre style="color:brown">
  def func(arg1,arg2,...,argn="val"):
      ...
      return result
  </pre>    
  * conditional statements:
  <pre style="color:brown">
  if cond1:
      ...
  elif cond2:
      ...
  else:
      ...
  </pre>
  * while ...:
  * for ...

Example, Dictionary data
---
Raw dict-type data is not easy to read, so use Python to print it in a tidy form:

```
Dict Type:
   data={key1:value1, key2:val2,...}
-> data[key1] = value1
```
Python solution:
```Python
   data = (dict type ...)
   for item in data:
       print('[+] ' + item + ':' + data[item])
                       ꜛ              ꜛ   
                      key           value
 ```    

In [3]:
book_info={'Producer': 'Adobe PDF Library 10.00', 
           'CreationDate': "D:20121106094325+05'30'",  
           'Title': 'Network Traffic Analysis with Python', 
           'Creator': 'Elsevier',
           'Subject': 'Violent Python, First Edition (2013) 125-169. doi:10.1016/B978-1-59-749957-6.00004-1'}

In [4]:
for item in book_info:
    print('[+] ' + item + ':' + book_info[item])

[+] Creator:Elsevier
[+] Producer:Adobe PDF Library 10.00
[+] CreationDate:D:20121106094325+05'30'
[+] Subject:Violent Python, First Edition (2013) 125-169. doi:10.1016/B978-1-59-749957-6.00004-1
[+] Title:Network Traffic Analysis with Python
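
The loop above can be tightened a little. The following is only a sketch of one possible layout: it iterates over `dict.items()` and pads the keys to a common width. The helper name `format_info` and the shortened `sample` dictionary are assumptions made for illustration.

```python
# Hypothetical helper: format a dict as aligned "[+] key : value" lines.
def format_info(info):
    width = max(len(key) for key in info)      # the longest key sets the column width
    lines = []
    for key, value in info.items():            # items() yields (key, value) pairs
        lines.append('[+] ' + key.ljust(width) + ' : ' + str(value))
    return lines

sample = {'Title': 'Network Traffic Analysis with Python', 'Creator': 'Elsevier'}
for line in format_info(sample):
    print(line)
```

Using `str(value)` also keeps the loop from failing when a value is not a string.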


Text Strings
---
Text is the most familiar type of data to most readers, so we’ll begin with some of the powerful features of text strings in Python.

Unicode
---
(In Chinese: 萬國碼, "code of all nations".)

All of the text examples in this book thus far have been plain old **ASCII**. ASCII was defined in the 1960s, when computers were the size of refrigerators and only slightly better at performing computations. The basic unit of computer storage is the byte, which can store 256 unique values in its eight bits. For various reasons, ASCII only used 7 bits (128 unique values): 26 uppercase letters, 26 lowercase letters, 10 digits, some punctuation symbols, some spacing characters, and some *nonprinting control codes*.

Unfortunately, the world has more letters than ASCII provides. You could have a hot dog at a diner, but never a Gewürztraminer at a café. Many attempts have been made to add more letters and symbols, and you’ll see them at times. A couple of those are:
- Latin-1, also called ISO 8859-1
- Windows code page 1252

Each of these uses all eight bits, but even that’s not enough, especially when you need non-European languages. <b><font color="red">Unicode</font></b> is an ongoing international standard to define the characters of all the world’s languages, plus symbols from mathematics and other fields.

In [77]:
msg='I have a dream,'
# print out the message and its type
print(msg,type(msg))

I have a dream, <class 'str'>


In [78]:
msg_bytes=msg.encode('latin-1')
# print the string encoded as Latin-1 bytes
print(msg_bytes,type(msg_bytes))

b'I have a dream,' <class 'bytes'>


In [28]:
a='a'
a_bytes=a.encode('latin-1')
print(ord(a),a_bytes,type(a_bytes))

97 b'a' <class 'bytes'>


In [29]:
a='a'
a_bytes=a.encode('utf-8')
print(ord(a),a_bytes,type(a_bytes))

97 b'a' <class 'bytes'>


In [2]:
uber='ü'
print(uber,type(uber))

ü <class 'str'>


In [30]:
uber_bytes=uber.encode('latin-1')
print(ord(uber),uber_bytes)

252 b'\xfc'


```252 > 127``` which means it is outside the ASCII table (codes 0 to 127)! And ```\xfc = 16*15 + 12 = 252```.

In [31]:
uber='ü'
uber_bytes=uber.encode('utf-8')
print(ord(uber),uber_bytes)

252 b'\xc3\xbc'


In [24]:
uber.encode('ascii', 'backslashreplace')

b'\\xfc'

In [3]:
# Let's input some Traditional Chinese characters
b='卦包'
# What we type is what we see
print(b,type(b))

卦包 <class 'str'>


In [4]:
# can it be encoded in Latin-1?
b_bytes=b.encode('latin-1')
b_bytes,type(b_bytes)

UnicodeEncodeError: 'latin-1' codec can't encode characters in position 0-1: ordinal not in range(256)

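<font color="brown">What's wrong here?</font> As the error message says, the encoding is wrong: Latin-1 only covers code points 0 to 255, and these characters lie far outside that range.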
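
One way to see which encodings can represent a given string is to try each one and catch `UnicodeEncodeError`. This is a minimal sketch; the helper name `try_encode` is hypothetical.

```python
def try_encode(text, encoding):
    """Return the encoded bytes, or None when the codec cannot represent text."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError:
        return None

# Compare a Latin-1 letter and two Chinese characters across three codecs.
for enc in ('ascii', 'latin-1', 'utf-8'):
    print(enc, try_encode('ü', enc), try_encode('卦包', enc))
```

Only UTF-8 succeeds for both strings; ASCII fails for both, and Latin-1 handles `ü` but not the Chinese characters.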

In [80]:
# encode in UTF-8 instead
b_bytes=b.encode('utf-8')
print(b_bytes,type(b_bytes))

b'\xe5\x8d\xa6\xe5\x8c\x85' <class 'bytes'>


In [82]:
# decode back to string
print(b_bytes.decode('utf-8'))

卦包


In [12]:
# Hexadecimal code points of the characters above
b.encode('ascii', 'backslashreplace')

b'\\u5366\\u5305'

In [13]:
# Decimal code points of the characters above
b.encode('ascii', 'xmlcharrefreplace')

b'&#21350;&#21253;'

Relation between the above
---
<img src="imgs/fonttable.png" width=60% />

Mathematical Time
---
Why does the following hold? <pre>U+5366 = 21350</pre>
<b>Answer</b><br>
<pre style="color:brown">
6 + 6*16 + 3*16*16 + 5*16*16*16 = 21350
</pre>

In [18]:
# check it out
6 + 6*16 + 3*16**2 + 5*16**3

21350
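
The same arithmetic can be left to Python's built-ins. The sketch below uses only `ord`, `hex`, `int`, and `chr` to move between a character, its decimal code point, and its hexadecimal form.

```python
ch = '卦'
code_point = ord(ch)                 # decimal code point of the character
print(code_point, hex(code_point))   # 21350 0x5366
print(int('5366', 16))               # parse the hexadecimal digits: 21350
print(chr(0x5366) == ch)             # True: chr() inverts ord()
print('\u5366\u5305')                # \u escapes spell the same two characters
```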

1. Data Type
===
The printable symbols of the ASCII code that we read and type are listed below:

<code style="color:brown;font-size:2em;">
! " # $ % & ' ( ) * + , - . /  0
1 2 3 4 5 6 7 8 9 : ; < = > ?  @
A B C D E F G H I J K L M N O  P
Q R S T U V W X Y Z [ \ ] ^ _  `
a b c d e f g h i j k l m n o  p
q r s t u v w x y z { | } ~ 
</code>

With this ordered character map, we can communicate with the computer through the keyboard.

1.1 For Human or Machines
---
- **Human read**: ```a, b, c, ...```
- **ASCII order**: ```97, 98, 99, ...```
- **Machines read**: ```0x61, 0x62, 0x63, ...```

In [83]:
# Where does '0x61' come from?
6*16+1==97

True

In [84]:
print(ord('a'))

97


In [85]:
# order of character
codes=['a','b','c','d']
for c in codes:
    print(ord(c))


97
98
99
100


But machines read hexadecimal codes (or binary machine code):

In [86]:
print(hex(97))

0x61


In [160]:
for c in codes:
    st=hex(ord(c))
    print(st)
    #print(str(st))

0x61
0x62
0x63
0x64


**In Brief**,
<code style="color:red;">
              ord(i)   
    Human     ---&gt;    ASCII 
             &lt;---
       ^      chr(i)    |
       |                | hex(i)
       |                V
       .   --------  Machine
           print(hex)
</code>

In [162]:
# Pretty layout by format()
print("{:3}".format("  i"),"{:6}".format('hex(i)'),"{}".format('chr(i)'))
for i in range(97,101):
    print("{:3}".format(i),"{:6}".format(' '+hex(i)),"{}".format('  '+chr(i)))
    #print(i,hex(i),chr(i))

  i hex(i) chr(i)
 97  0x61    a
 98  0x62    b
 99  0x63    c
100  0x64    d


1.2 Example, ASCII table's Layout
---
Make a table of the characters with codes 0 to 255; the first 128 of them (codes 0 to 127) are the ASCII characters:

In [25]:
# Character table for codes 0 - 255 (ASCII is 0 - 127)
for i in range(16):
    print('',hex(i),end=' ')
for i in range(256):
    if i%16==15:
       print(' ',chr(i))
    else:
       #print(chr(i),end=' ') 
       print(' ',"{:2}".format(chr(i)),end=' ')

 0x0  0x1  0x2  0x3  0x4  0x5  0x6  0x7  0x8  0x9  0xa  0xb  0xc  0xd  0xe  0xf                                        	    
                    
                                                              
       !    "    #    $    %    &    '    (    )    *    +    ,    -    .    /
  0    1    2    3    4    5    6    7    8    9    :    ;    <    =    >    ?
  @    A    B    C    D    E    F    G    H    I    J    K    L    M    N    O
  P    Q    R    S    T    U    V    W    X    Y    Z    [    \    ]    ^    _
  `    a    b    c    d    e    f    g    h    i    j    k    l    m    n    o
  p    q    r    s    t    u    v    w    x    y    z    {    |    }    ~    
                                                              
                                                              
       ¡    ¢    £    ¤    ¥    ¦    §    ¨    ©    ª    «    ¬    ­    ®    ¯
  °    ±    ²    ³    ´    µ    ¶    ·    ¸    ¹   

In [32]:
for i in range(16):
    j=16*i
    print(bytes(range(j,j+16)))


b'\x00\x01\x02\x03\x04\x05\x06\x07\x08\t\n\x0b\x0c\r\x0e\x0f'
b'\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f'
b' !"#$%&\'()*+,-./'
b'0123456789:;<=>?'
b'@ABCDEFGHIJKLMNO'
b'PQRSTUVWXYZ[\\]^_'
b'`abcdefghijklmno'
b'pqrstuvwxyz{|}~\x7f'
b'\x80\x81\x82\x83\x84\x85\x86\x87\x88\x89\x8a\x8b\x8c\x8d\x8e\x8f'
b'\x90\x91\x92\x93\x94\x95\x96\x97\x98\x99\x9a\x9b\x9c\x9d\x9e\x9f'
b'\xa0\xa1\xa2\xa3\xa4\xa5\xa6\xa7\xa8\xa9\xaa\xab\xac\xad\xae\xaf'
b'\xb0\xb1\xb2\xb3\xb4\xb5\xb6\xb7\xb8\xb9\xba\xbb\xbc\xbd\xbe\xbf'
b'\xc0\xc1\xc2\xc3\xc4\xc5\xc6\xc7\xc8\xc9\xca\xcb\xcc\xcd\xce\xcf'
b'\xd0\xd1\xd2\xd3\xd4\xd5\xd6\xd7\xd8\xd9\xda\xdb\xdc\xdd\xde\xdf'
b'\xe0\xe1\xe2\xe3\xe4\xe5\xe6\xe7\xe8\xe9\xea\xeb\xec\xed\xee\xef'
b'\xf0\xf1\xf2\xf3\xf4\xf5\xf6\xf7\xf8\xf9\xfa\xfb\xfc\xfd\xfe\xff'
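
Control characters make the table above hard to read. A possible alternative, sketched here under the assumption that only codes 32 to 127 are of interest, replaces anything non-printable with a dot. The function name `ascii_rows` is hypothetical.

```python
def ascii_rows():
    """Return six strings of 16 characters covering codes 32-127."""
    rows = []
    for start in range(32, 128, 16):
        # in this range only DEL (code 127) is non-printable; show it as '.'
        row = ''.join(chr(i) if chr(i).isprintable() else '.'
                      for i in range(start, start + 16))
        rows.append(row)
    return rows

# Print each row with the hexadecimal code of its first character.
for start, row in zip(range(32, 128, 16), ascii_rows()):
    print(hex(start), ' '.join(row))
```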


TCP/IP address
---
Every device on the Internet is assigned an address. An IPv4 address consists of four numbers, each ranging from 0 to 255, such as
<code>
127.0.0.1
</code>

In [41]:
import socket
s = '127.0.0.1'
dst = socket.inet_aton(s)
dst

b'\x7f\x00\x00\x01'

In [161]:
# nothing printed? '\x01' is a non-printing control character
print('\x01')




Questions
---
1. What's the relation between <code>'\x7f\x00\x00\x01'</code> and <code>127.0.0.1</code>?
2. Someone found the string <code>\x16\x03\x01\x01\x1e\x01</code> in a log file. Try to help translate it.

In [5]:
import socket
s = '163.25.114.1'
dst = socket.inet_aton(s)
dst

b'\xa3\x19r\x01'

In [3]:
# ** help on-line **
# bytes giving packed 32-bit IP representation
?socket.inet_aton

In [35]:
socket.inet_ntoa(dst)

'163.25.114.1'

Automating the IP Conversion
---

<code style="color:brown;font-size:1.2em;">
\xa3\x19r\x01    --->    163.25.114.1
                                               output
\xa3   ---> a (10) x 16 + 3   --->  163 + '.'   --->   163.
\x19   --->     1  x 16 + 9   --->   25 + '.'   --->       25.
r      --->   ord(r) = 114    --->  114 + '.'   --->          114.
\x01   --->               1   --->    1 + '.'   --->              1

</code>
--->  **163.25.114.1**

In [34]:
from struct import pack

In [35]:
def my_ntoa(dst, sep='.'):
    # Convert every character to its decimal code and join the codes with sep.
    # (Testing ch != dst[-1] to detect the end would drop a separator
    # whenever an earlier character equals the last one.)
    return sep.join(str(ord(ch)) for ch in dst)

In [36]:
address1='\x7f\x00\x00\x01'
my_ntoa (address1)

'127.0.0.1'

In [38]:
address2="\xa3\x19r\x01"
my_ntoa (address2)

'163.25.114.1'

In [37]:
print('\x16\x03\x01\x01\x1e\x01')




In [38]:
address3='\x16\x03\x01\x01\x1e\x01'
my_ntoa (address3,' ')

'22 3 1 1 30 1'
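
Note that `socket.inet_aton` returns a `bytes` object, while the calls above pass `str` literals. Iterating over `bytes` in Python 3 already yields integers, so `ord()` is not needed there. A minimal sketch for that case, with the hypothetical name `bytes_ntoa`:

```python
import socket

def bytes_ntoa(packed, sep='.'):
    # each element of a bytes object is already an int in the range 0-255
    return sep.join(str(b) for b in packed)

packed = socket.inet_aton('163.25.114.1')
print(bytes_ntoa(packed))                               # 163.25.114.1
print(bytes_ntoa(packed) == socket.inet_ntoa(packed))   # True
```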

PNG, the Image Format
---
<img src="imgs/logo-bro.png" />

PNG is a popular image format. Opened in a text editor, its raw content looks like the following listing:
<pre style="color:brown">
<89>PNG^M
^Z
^@^@^@^MIHDR^@^@^@¯^@^@^@H^H^F^@^@^@<80>Ó¥O^@^@^@       pHYs^@^@.#^@^@.#^Ax¥?v^@^@
OiCCPPhotoshop ICC profile^@^@xÚ<9d>SgTSé^V=÷ÞôBK<88><80><94>KoR^U^H RB<8b><80>^T<91>&*!        ^PJ<88>!¡Ù^UQÁ^QEE^D^[È <88>^C<8e><8e><80><8c>^UQ,^L<8a>
Ø^Gä!¢<8e><83>£<88><8a>Êûá{£kÖ¼÷æÍþµ×>ç¬ó<9d>³Ï^GÀ^H^L<96>H3Q5<80>^L©B^^^Qà<83>ÇÄÆáä.@<81>
$p^@^P^H³d!sý#^A^@ø~<<+"À^G¾^@^AxÓ^K^H^@ÀM<9b>À0^\<87>ÿ^OêB<99>\^A<80><84>^AÀt<91>8K^H<80>^T^@@z<8e>B¦^@@F^A<80><9d><98>&S^@ ^D^@`Ëcbã^@P-^@`'^?æÓ^@<80><9d>ø<99>{^A^@[<94>!^U^A <91>^@ ^Se<88>D^@h;^@¬ÏV<8a>E^@X0^@^TfKÄ9^@Ø-^@0IWfH^@°·^@ÀÎ^P^K²^@^H^L^@0Q<88><85>)^@^D{^@`È##x^@<84><99>^@^TFòW<ñ+®^Pç*^@^@x<99>²<¹$9E<81>[^H-q^GWW.^^(ÎI^W+^T6a^Ba<9a>@.Ây<99>^Y2<81>4^OàóÌ^@^@ <91>^U^Qà<83>óýxÎ^N®ÎÎ6<8e>¶^N_-ê¿^Fÿ"bbãþåÏ«p@^@^@át~Ñþ,/³^Z<80>;^F<80>mþ¢%î^Dh^^K u÷<8b>f²^O@µ^@ éÚWópø~<<E¡<90>¹ÙÙåääØJÄB[aÊW}þgÂ_ÀWýlù~<ü÷õà¾â$<81>2]<81>G^DøàÂÌôL¥^\Ï<92>      <84>bÜæ<8f>Gü·^Kÿü^]Ó"ÄIb¹X*^TãQ^Rq<8e>D<9a><8c>ó2¥"<89>B<92>)Å%Òÿdâß,û^C>ß5^@°j>^A{<91>-¨]c^CöK'^PXtÀâ÷^@^@ò»oÁÔ(^H^C<80>h<83>áÏwÿï?ýG %^@<80>fI<92>q^@^@^D$.TÊ³?Ç^H^@^@D <81>*°A^[ôÁ^X,À^F^\Á^EÜÁ^Kü`6<84>B$ÄÂB^PB
d<80>^\r`)¬<82>B(<86>Í°^]*`/Ô@^]4ÀQh<86><93>p^N.ÂU¸^N=p^Oúa^H<9e>Á(¼<81>        ^DAÈ^H^Sa!Ú<88>^Ab<8a>X#<8e>^H^W<99><85>ø!ÁH^D^R<8b>$ É<88>^TQ"K<91>5H1R<8a>T UH^]ò=r^B9<87>\Fº<91>;È^@2<82>ü<86>¼G1<94><81>²Q=Ô^LµC¹¨7^Z<84>F¢^KÐdt1<9a><8f>^V <9b>Ðr´^Z=<8c>6¡çÐ«h^OÚ<8f>>CÇ0Àè^X^G3Äl0.ÆÃB±8,        <93>cË±"¬^L«Æ^Z°V¬^C»<89>õcÏ±w^D^R<81>EÀ        6^DwB a^^AHXLXNØH¨ ^\$4^QÚ      7       ^C<84>QÂ'"<93>¨K´&º^QùÄ^Xb21<87>XH,#Ö^R<8f>^S/^P{<88>CÄ7$^R<89>C2'¹<90>^BI±¤TÒ^RÒFÒnR#é,©<9b>4H^Z#<93>ÉÚdk²^G9<94>, +È<85>ä<9d>äÃä3ä^[ä!ò[
<9d>b@q¤øSâ(RÊjJ^Yå^På4å^Fe<98>2AU£<9a>RÝ¨¡T^Q5<8f>ZB­¡¶R¯Q<87>¨^S4u<9a>9Í<83>^VIK¥­¢<95>Ó^Zh^Wh÷i¯ètº^QÝ<95>^^N<97>ÐWÒËéGè<97>è^Côw^L^M<86>^U<83>Ç<88>g(^Y<9b>^X^G^Xg^Yw^X¯<98>L¦^YÓ<8b>^YÇT071ë<98>ç<99>^O<99>oUX*¶*|^U<91>Ê
<95>J<95>&<95>^[*/T©ª¦ªÞª^KUóUËT<8f>©^S}®FU3Sã© Ô<96>«Uª<9d>PëS^[Sg©;¨<87>ªg¨oT?¤~Yý<89>^FYÃLÃOC¤Q ±_ã¼Æ ^Kc^Y³x,!k^M«<86>u<81>5Ä&±ÍÙ|v*»<98>ý^]»<8b>=ª©¡9C3J3W³Ró<94>f?^Gã<98>qø<9c>tN ç(§<97>ó~<8a>Þ^Tï)â)^[¦4L¹1e\kª<96><97><96>X«H«Q«Gë½6®í§<9d>¦½E»Yû<81>^NAÇJ'\'Gg<8f>Î^E<9d>çSÙSÝ§
</pre>

It can be opened properly by image applications, but not by text editors!

To print and inspect the first 24 bytes:
1. open the image file, read it, and close it,
2. get the specification of its shape (from its header),
3. calculate its size.

In [40]:
# read the binary file
imgfile = open('imgs/logo-bro.png', 'rb')
bdata = imgfile.read()
imgfile.close()

# calculate its size in K, 1K is equal to 1024 bytes
size=int(len(bdata))/1024
print("Size of image file: %3.1f k" %size)

Size of image file: 10.6 k


In [105]:
print(bdata[:25],len(bdata[:25]))

b'\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR\x00\x00\x00\xaf\x00\x00\x00H\x08' 25


In [99]:
bdata[25]

6

The first 25 bytes,
<code>
<font color="red">\x89 P N G \r \n \x1a \n</font> \x00 \x00 \x00 \r I H D R <font color="green">\x00 \x00 \x00 \xaf</font><font color="brown"> \x00 \x00 \x00 H</font> \x08
</code>

In [82]:
ord('A'),ord('Z'),ord('\n'),ord('\r')

(65, 90, 10, 13)

In [95]:
for i in range(25):
    if (bdata[i]) in range(65,91):
        print(chr(bdata[i]),end=' ')
    elif (bdata[i]==10):  
        print('\\n',end=' ')
    elif (bdata[i]==13):  
        print('\\r',end=' ')    
    else:
        print(''.join(hex(bdata[i])),end=' ')

0x89 P N G \r \n 0x1a \n 0x0 0x0 0x0 \r I H D R 0x0 0x0 0x0 0xaf 0x0 0x0 0x0 H 0x8 

The PNG format specification stipulates that the width and height are stored within the first 24 bytes. Using the Python standard library module <b>struct</b>, we can convert binary data to and from Python data structures.

In [30]:
import struct

In [50]:
valid_png_header = b'\x89PNG\r\n\x1a\n'

if bdata[:8] == valid_png_header:
    # width is stored in bytes 16-19 and height in bytes 20-23
    width, height = struct.unpack('>LL', bdata[16:24])
    print('Valid PNG, width', width, 'height', height)
else:
    print('Not a valid PNG')    

Valid PNG, width 175 height 72
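
The check can be wrapped in a reusable function. The sketch below builds the 24 leading bytes by hand so that no image file is needed; the names `png_dimensions`, `PNG_SIGNATURE`, and `header` are assumptions for illustration, and the sample width and height are the ones printed above.

```python
import struct

PNG_SIGNATURE = b'\x89PNG\r\n\x1a\n'

def png_dimensions(data):
    """Return (width, height) from PNG bytes, or None if the signature is wrong."""
    if len(data) < 24 or data[:8] != PNG_SIGNATURE:
        return None
    return struct.unpack('>LL', data[16:24])

# signature (8 bytes) + chunk length and type (8 bytes) + width and height (8 bytes)
header = PNG_SIGNATURE + b'\x00\x00\x00\rIHDR' + struct.pack('>LL', 175, 72)
print(png_dimensions(header))            # (175, 72)
print(png_dimensions(b'\xff\xd8\xff'))   # None: too short and not a PNG
```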


Note about the struct module
---
<pre>
Specifier Byte order
<         little endian 
>         big endian

Specifier   Description             Bytes
L           unsigned long integer   4 
</pre>

Here, <big><font color="brown">&gt;LL</font></big> means: convert the data ```bdata[16:24]``` into two ```"4-byte unsigned long integer"```s, read from left to right in big-endian byte order.

In [109]:
# width size and height size
print(bdata[16:20],bdata[20:24])

b'\x00\x00\x00\xaf' b'\x00\x00\x00H'


In [28]:
0xaf

175

In [69]:
# convert 2 4-byte data, [16:20] and [20:24], to Python data
struct.unpack('>2L', bdata[16:24])

(175, 72)

In [64]:
# convert Python data to bytes
struct.pack('>L', 175)

b'\x00\x00\x00\xaf'

In [31]:
# convert Python data to bytes
struct.pack('>L', 72)

b'\x00\x00\x00H'

In [32]:
# convert Python data to bytes
struct.pack('<L', 72)

b'H\x00\x00\x00'
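
The effect of the byte-order specifier is easiest to see by unpacking the same four bytes both ways. This sketch also shows `int.from_bytes`, a built-in that gives the same answers without a format string.

```python
import struct

raw = b'\x00\x00\x00H'
big = struct.unpack('>L', raw)[0]        # most significant byte first
little = struct.unpack('<L', raw)[0]     # least significant byte first
print(big, little)                       # 72 1207959552

# the same conversion with the int type's own method
print(int.from_bytes(raw, 'big'), int.from_bytes(raw, 'little'))
print(struct.calcsize('>LL'))            # 8: two 4-byte unsigned longs
```

Reading the bytes in the wrong order turns a height of 72 into `0x48000000`, which is why the specifier matters.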

Question
---
1. As above
<pre>
In:  bdata[20:24]
Out: b'\x00\x00\x00H'
</pre>
Could we input
<code>
 H
</code>
to get the size of height?
- The width and height of a PNG image are stored in the ```L``` (unsigned long integer) format. How large can the dimensions of a PNG be?

In [2]:
# read the binary file
imgfile = open('imgs/20161121002864.jpg', 'rb')
bdata = imgfile.read()
imgfile.close()
size=int(len(bdata))/1024
print("Size of image file: %3.1f k" %size)

Size of image file: 41.0 k


In [127]:
print(bdata[16:24])

b'\x00`\x00\x00\xff\xdb\x00C'


In [50]:
valid_png_header = b'\x89PNG\r\n\x1a\n'

if bdata[:8] == valid_png_header:
    # width is stored in bytes 16-19 and height in bytes 20-23
    width, height = struct.unpack('>LL', bdata[16:24])
    print('Valid PNG, width', width, 'height', height)
else:
    print('Not a valid PNG')    

Not a valid PNG
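
The leading bytes of a file identify its format, so the same idea extends beyond PNG. A minimal sketch, assuming only two formats are of interest; the names `SIGNATURES` and `guess_image_type` are hypothetical, and the sample byte strings are hand-written rather than read from the image files.

```python
# Leading "magic" bytes of two common image formats
SIGNATURES = {
    'PNG': b'\x89PNG\r\n\x1a\n',
    'JPEG': b'\xff\xd8\xff',
}

def guess_image_type(data):
    """Return the format name whose signature starts the data, else None."""
    for name, magic in SIGNATURES.items():
        if data.startswith(magic):
            return name
    return None

print(guess_image_type(b'\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR'))   # PNG
print(guess_image_type(b'\xff\xd8\xff\xe0\x00\x10JFIF'))          # JPEG
```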


2. String Manipulating
---

Dictionary Array
---
Python supports the dictionary-type data format as follows: 
<code style="color:brown;">
array_data={'key1': val1, 'key2': val2, ...}
array_data['key1']=val1
</code>

In [1]:
print('nice'.ljust(10, '.') + '150000'.rjust(10,'□'))

nice......□□□□150000


In [None]:
def printPicnic(itemsDict, leftWidth, rightWidth):
    print('PICNIC ITEMS'.center(leftWidth + rightWidth, '-'))
    for k, v in itemsDict.items():
        print(k.ljust(leftWidth, '.') + str(v).rjust(rightWidth))
picnicItems = {'sandwiches': 4, 'apples': 12, 'cups': 4, 'cookies': 8000}

In [None]:
printPicnic(picnicItems, 12, 5)

In [None]:
printPicnic(picnicItems, 20, 6)
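
The same layout can be produced with format specifications instead of `ljust()` and `rjust()`. This is a sketch of an alternative, not a replacement for `printPicnic`; the function name `format_picnic` is hypothetical, and it returns the lines rather than printing them.

```python
def format_picnic(items, left_width, right_width):
    """Return the table as a list of lines (hypothetical variant of printPicnic)."""
    lines = ['PICNIC ITEMS'.center(left_width + right_width, '-')]
    for name, count in items.items():
        # '.<' pads on the right with dots; '>' right-aligns the number
        lines.append(f'{name:.<{left_width}}{count:>{right_width}}')
    return lines

for line in format_picnic({'sandwiches': 4, 'apples': 12}, 12, 5):
    print(line)
```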

3. Matching Multiple Groups with the Pipe
===

Say you want to  find a phone number from Taipei in a string.

Pattern example: <code style="color:brown">02-211-8800</code>

1. length: 11 characters (9 digits and 2 hyphens),
2. a two-digit area code: 02,
3. the first hyphen, after the area code,
4. three more numeric characters,
5. another hyphen,
6. and finally four more digits.

In [None]:
def isPhoneNumber(text):
    print('Is %s a phone number in Taipei?' %text)
    if len(text) != 11:
        return False
    for i in range(0, 2):
        if not text[i].isdecimal():
            return False
    if text[2] != '-':
        return False
    for i in range(3, 6):
        if not text[i].isdecimal():
            return False
    if text[6] != '-':
        return False
    for i in range(7, 11):
        if not text[i].isdecimal():
            return False
    return True

In [None]:
#print('087-222222 is a phone number:')
print(isPhoneNumber('087-222222'))

In [None]:
isPhoneNumber('02-211-8800')

Find out from a messages
---
1. Test every consecutive substring of length 11.

In [None]:
def isPhoneNumberv2(text):
    if len(text) != 11:
        return False
    for i in range(0, 2):
        if not text[i].isdecimal():
            return False
    if text[2] != '-':
        return False
    for i in range(3, 6):
        if not text[i].isdecimal():
            return False
    if text[6] != '-':
        return False
    for i in range(7, 11):
        if not text[i].isdecimal():
            return False
    return True

In [None]:
message = 'The representative number of the campus is 02-211-8800 and 02-211-8700 is that of the scanner.'
k=1
for i in range(len(message)): 
    chunk = message[i:i+11]
    if isPhoneNumberv2(chunk):
       print(k,' Phone number found: ' + chunk)
       k=k+1  
print('o-- Done and ',k-1, ' phone numbers found')

Finding Patterns of text with regular expressions
---
This seems to work well. However, a string without hyphens, such as '022118800', is not recognized:


In [None]:
print(isPhoneNumber('022118800'))

<code style="color:brown">re</code> module
---
Python's dedicated regular expression module, <code style="color:brown">re</code>, helps with this kind of work:

1. Import the regex module with `import re`.
2. Create a Regex object with the `re.compile()` function. (Remember to use a raw string.)
3. Pass the string you want to search into the Regex object’s `search()` method. This returns a Match object.
4. Call the Match object’s `group()` method to return a string of the actual matched text.

**Fact**

xx-xxx-xxxx denotes the same number as xx xxx xxxx

**Formula**

<code style="background-color:#99eeee">rule=re.compile(...)</code><br>
<code style="background-color:#99ffff">rule.search(string)</code>

In [1]:
import re
phoneNumRegex1 = re.compile(r'(\d\d)-(\d\d\d-\d\d\d\d)')
mo = phoneNumRegex1.search('CGU number is 02-211-8800.')
mo.group()

'02-211-8800'

In [2]:
phoneNumRegex2 = re.compile(r'(\(\d\d\))-(\d\d\d-\d\d\d\d)')
mo = phoneNumRegex2.search('CGU tel number is (02)-211-8800')
mo.group()

'(02)-211-8800'

**modifier, (...)?**

<code>(</code><code style="color:brown">\\(</code><code>)?, (</code><code style="color:brown">\\)</code><code>)?</code>: this part of the regular expression means that the pattern, `(` or   `)`, is an optional group. 

**Note** that the backslash symbol, \, used here declares that "(" and ")" are literal characters rather than grouping syntax.

In [3]:
phoneNumRegex3 = re.compile(r'((\()?\d\d(\))?)-(\d\d\d-\d\d\d\d)')
mo = phoneNumRegex3.search('CGU tel number is 02-211-8800')
mo.group()

'02-211-8800'

In [4]:
mo = phoneNumRegex3.search('CGU tel number is (02)-211-8800')
mo.group()

'(02)-211-8800'

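Other modifiers
---
- `(...)*`, matches zero or more repetitions (the star),
- `(...)+`, matches one or more repetitions (the plus),
- `(...){2,3}`, matches a specific number of repetitions, here 2 or 3,
- `^`, `$` (called caret and dollar), match at the beginning or the end of the text,
- `.` (called dot), the wildcard character, which matches any character except a newline,
- `|` (called pipe), matches either the expression on its left or the one on its right.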
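
Each of the modifiers listed above can be tried on a short string. The sample strings below are made up for illustration; only the `re` calls themselves come from the standard library.

```python
import re

print(re.search(r'(ha)*!', 'hahaha!').group())       # hahaha!  zero or more
print(re.search(r'(ha)+', 'say hahaha').group())     # hahaha   one or more
print(re.search(r'(ha){2,3}', 'hahahaha').group())   # hahaha   greedy, takes 3
print(bool(re.search(r'^\d\d-', '02-211-8800')))     # True: two digits at the start
print(bool(re.search(r'\d{4}$', '02-211-8800')))     # True: four digits at the end
print(re.search(r'c.t', 'cat cot cut').group())      # cat      dot matches any character
print(re.search(r'\((02|03)\)', 'call (03)-555-1234').group(1))   # 03  pipe: either code
```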

the findall() method
---

Find all the matching contents:

In [5]:
mo = phoneNumRegex3.search('CGU tel number is 02-211-8800 and trnsfer number is (02)-211-8700')
mo.group()

'02-211-8800'

In [6]:
mo = phoneNumRegex3.findall('CGU tel number is 02-211-8800 and trnsfer number is (02)-211-8700')
mo,mo[1],

([('02', '', '', '211-8800'), ('(02)', '(', ')', '211-8700')],
 ('(02)', '(', ')', '211-8700'))

In [7]:
print(mo[1][0],'-',mo[1][3])

(02) - 211-8700


In [17]:
print("There are %d found" %len(mo))
i=1
for item in mo:
    print("%s)." %i,item[0],'-',item[3])
    i=i+1

There are 2 found
1). 02 - 211-8800
2). (02) - 211-8700
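
The empty strings in the tuples come from the optional parenthesis groups. One possible refinement, sketched here as an assumption rather than as part of the original cells, marks those groups as non-capturing with `(?:...)` and uses `finditer()` to reach the full match objects. The name `phone_re` is hypothetical.

```python
import re

# Same shape as phoneNumRegex3, but the optional parentheses do not capture,
# so only the area code and the local number appear in the result tuples.
phone_re = re.compile(r'((?:\()?\d\d(?:\))?)-(\d\d\d-\d\d\d\d)')
text = 'CGU tel number is 02-211-8800 and transfer number is (02)-211-8700'

print(phone_re.findall(text))   # [('02', '211-8800'), ('(02)', '211-8700')]
for n, m in enumerate(phone_re.finditer(text), start=1):
    print(n, m.group(0), 'at position', m.start())
```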


Character Classes
---
```
\d Any numeric digit from 0 to 9.
\D Any character that is not a numeric digit from 0 to 9.
\w Any letter, numeric digit, or the underscore character. 
\W Any character that is not a letter, numeric digit, or the underscore character
\s Any space, tab, or newline character.
\S Any character that is not a space, tab, or newline.

```
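
A quick way to get a feel for these classes is to run them against one short string. The `sample` string below is made up for illustration.

```python
import re

sample = 'Tel: 02-211_8800\tok\n'
print(re.findall(r'\d', sample))    # every digit, one per list element
print(re.findall(r'\w+', sample))   # ['Tel', '02', '211_8800', 'ok']
print(re.findall(r'\s', sample))    # [' ', '\t', '\n']
print(re.sub(r'\D', '', sample))    # 022118800 (every non-digit removed)
```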

In [None]:
import pyperclip, re

In [None]:
phoneRegex = re.compile(r'''(
    (\d{2}|\(\d{2}\))?                # area code, 2 digits
    (\s|-|\.)?                        # separator
    (\d{3})                           # first 3 digits
    (\s|-|\.)                         # separator
    (\d{4})                           # last 4 digits
    (\s*(ext|x|ext\.)\s*(\d{2,5}))?   # extension
    )''', re.VERBOSE)

In [None]:
text='CGU tel number is 02-211-8800 and trnsfer number is (02)-211-8700'
matches = []
for groups in phoneRegex.findall(text):
    phoneNum = '-'.join([groups[1], groups[3], groups[5]])
    if groups[8] != '':
        phoneNum += ' x' + groups[8]
    matches.append(phoneNum)

In [None]:
# Copy results to the clipboard.
if len(matches) > 0:
    pyperclip.copy('\n'.join(matches)) # get one string at once
    print('Copied to clipboard:')
    print('\n'.join(matches))          # get all strings
else:
    print('No phone numbers or email addresses found.')

Passwords Design
---
Recently, app providers routinely ask customers and users to strengthen their passwords for security. But how does such a strength check work?

In [18]:
import string

In [None]:
string.ascii_uppercase 

In [None]:
string.ascii_lowercase 

Requirement
---
A password must be at least 8 characters long and include at least a) one digit, b) one uppercase letter, c) one lowercase letter, and d) one punctuation mark.

In [19]:
def is_strong_password(s):
    # a strong password has at least 8 characters and one of each class below
    length_regex = re.compile(r'.{8,}')
    upper_regex = re.compile(r'[ABCDEFGHIJKLMNOPQRSTUVWXYZ]')
    lower_regex = re.compile(r'[abcdefghijklmnopqrstuvwxyz]')
    digit_regex = re.compile(r'[0123456789]')
    punctuation_regex = re.compile(r'[!"#$%&\'()*+,-./:;<=>?@[\]^_`{|}~]')
    if length_regex.search(s) and \
        upper_regex.search(s) and \
        lower_regex.search(s) and \
        digit_regex.search(s) and \
        punctuation_regex.search(s):
            return 'Well done!'
    return False

In [20]:
passwd='Uv123450'
is_strong_password(passwd)

False

In [None]:
passwd='0Us#dwq_'
is_strong_password(passwd)
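
The `string` module imported earlier already provides the character sets, so the same rules can be checked without regular expressions. This is a sketch of one alternative design: instead of a yes/no answer it reports which requirements are unmet. The name `password_problems` is hypothetical.

```python
import string

def password_problems(password):
    """Return a list of unmet requirements; an empty list means 'strong'."""
    checks = [
        ('at least 8 characters', len(password) >= 8),
        ('an uppercase letter', any(c in string.ascii_uppercase for c in password)),
        ('a lowercase letter', any(c in string.ascii_lowercase for c in password)),
        ('a digit', any(c in string.digits for c in password)),
        ('a punctuation mark', any(c in string.punctuation for c in password)),
    ]
    return [label for label, ok in checks if not ok]

print(password_problems('Uv123450'))   # ['a punctuation mark']
print(password_problems('0Us#dwq_'))   # []
```

Returning the list of problems lets an application tell the user exactly what to fix.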