## File
### systems and path
A file in Python has two key properties: a *filename* and a *path*. On Windows, the path separator is the backslash "\\".

In [1]:
# the os package
import os

- Use the *os.path.join()* function to assemble path components with the path separator of the current operating system.

In [2]:
print(os.path.join("usr","bin","spam"))

usr\bin\spam


In [4]:
myFiles = ["account.txt", "details.csv", "invite.docx"]
for filename in myFiles:
    print(os.path.join('\\home', filename))

\home\account.txt
\home\details.csv
\home\invite.docx


- Get the current working directory with *os.getcwd()*; change it with *os.chdir()*.

In [18]:
print(os.getcwd())
os.chdir("C:\\Users\\fuwei\\Dropbox")
print(os.getcwd())
os.chdir("C:/Users/fuwei/Dropbox/Python Learning Code")
print(os.getcwd())

C:\Users\fuwei\Dropbox\Python Learning Code
C:\Users\fuwei\Dropbox
C:\Users\fuwei\Dropbox\Python Learning Code


- Create new folders using *os.makedirs()*
- Convert a **relative path** into an **absolute path** using *os.path.abspath()*
- *os.path.isabs(path)* tests whether the path is an absolute path
- Convert an **absolute path** into a **relative path** using *os.path.relpath(path, start)*
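
The first three functions in the list above can be tried out with a short sketch like the one below. It works inside a temporary directory so nothing is left on disk; the folder names and the `data/notes.txt` path are hypothetical and only serve as illustration.

```python
import os
import tempfile

# Work inside a throwaway directory so the example leaves nothing behind.
with tempfile.TemporaryDirectory() as base:
    # makedirs() creates every missing intermediate folder in one call.
    nested = os.path.join(base, "delicious", "walnut", "waffles")
    os.makedirs(nested, exist_ok=True)
    created = os.path.isdir(nested)

# A relative path is resolved against the current working directory.
relative = os.path.join("data", "notes.txt")  # hypothetical path, need not exist
absolute = os.path.abspath(relative)

print(created)                  # True
print(os.path.isabs(relative))  # False
print(os.path.isabs(absolute))  # True
```

Passing `exist_ok=True` keeps *os.makedirs()* from raising an error when the folder is already there, which is convenient when a notebook cell is re-run.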

In [20]:
print(os.path.relpath("C:\\windows", "C:\\"))
print(os.path.relpath("C:\\windows", "C:\\spam"))

windows
..\windows


In [22]:
path = "C:\\windows\\system32\\cal.exe"
print(path)
print(os.path.dirname(path))
print(os.path.basename(path))
print(os.path.split(path))
print(os.path.sep)

C:\windows\system32\cal.exe
C:\windows\system32
cal.exe
('C:\\windows\\system32', 'cal.exe')
\


- Inspect the contents of a folder:
1. List all the files and subfolders: *os.listdir(path)*
2. Get the size of a file in bytes: *os.path.getsize(path)*
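
As a rough illustration of the two calls above, the sketch below writes two small files into a temporary folder, lists them, and adds up their sizes. The file names and contents are made up for the example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    # Create two small files with known sizes (hypothetical names).
    for name, text in [("a.txt", "hello"), ("b.txt", "hi")]:
        with open(os.path.join(folder, name), "w") as f:
            f.write(text)

    # listdir() returns bare names in arbitrary order, so sort them.
    names = sorted(os.listdir(folder))
    # getsize() needs the full path, not just the file name.
    sizes = {name: os.path.getsize(os.path.join(folder, name)) for name in names}
    total_size = sum(sizes.values())

print(names)       # ['a.txt', 'b.txt']
print(sizes)       # {'a.txt': 5, 'b.txt': 2}
print(total_size)  # 7
```

Note that *os.listdir()* returns names only, so each one has to be joined with the folder path before it is passed to *os.path.getsize()*.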


### reading and writing files:
- call the **open()** function to return a File object;
- call the **read()** or **write()** method on the File object;
- call the **close()** method to close the File object.

A File object has methods like:
- **read()**: read the entire file content and return it as a single string.
- **readlines()**: return a list in which each element is one line of the file.
- **write()**: write a string to the file.

In [47]:
myfile = open("demo.txt", "w")
myfile.write('Hello World!\n')
myfile.close()

In [48]:
myfile = open("demo.txt", "a")
myfile.write('Goodbye World\n')
myfile.close()

In [49]:
myfile = open("demo.txt", "r")
c = myfile.readlines()
myfile.close()

In [60]:
print(c)

['Hello World!\n', 'Goodbye World\n']
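
A file that is never closed may keep written text in a buffer instead of putting it on disk, so a later read can come back empty. One way to avoid this is a `with` block, which closes the file automatically when the block ends. The sketch below repeats the write, append, and read steps in a temporary folder; the folder and the `lines` variable are introduced only for this example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "demo.txt")

    # "w" creates the file (or truncates an existing one).
    with open(path, "w") as f:
        f.write("Hello World!\n")

    # "a" appends to the end of the existing content.
    with open(path, "a") as f:
        f.write("Goodbye World\n")

    # "r" opens the file for reading; readlines() keeps the newlines.
    with open(path, "r") as f:
        lines = f.readlines()

print(lines)  # ['Hello World!\n', 'Goodbye World\n']
```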


### Save variables/data using the **shelve** package.

In [51]:
import shelve

In [54]:
mydata = shelve.open("mydata")
cat = ["Zophie", "Pooka", "Simon"]
mydata["cats"] = cat
mydata.close()

In [59]:
mydata = shelve.open("mydata")
print(mydata["cats"])  # a shelf works like a dictionary that is saved to disk
mydata.close()

['Zophie', 'Pooka', 'Simon']
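
A shelf can also be opened in a `with` block, which closes it automatically. The sketch below stores the same list in a temporary folder and reads it back; the folder and the variable names `db`, `keys`, and `cats` are assumptions made for this illustration.

```python
import os
import shelve
import tempfile

with tempfile.TemporaryDirectory() as folder:
    shelf_path = os.path.join(folder, "mydata")

    # Store a value under a string key, as with a dictionary.
    with shelve.open(shelf_path) as db:
        db["cats"] = ["Zophie", "Pooka", "Simon"]

    # Reopen the shelf and read the data back.
    with shelve.open(shelf_path) as db:
        keys = list(db.keys())
        cats = db["cats"]

print(keys)  # ['cats']
print(cats)  # ['Zophie', 'Pooka', 'Simon']
```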

## Regular Expressions

**Example:** Write a function that checks whether a string is a phone number of the form 415-555-1011.

In [83]:
def isPhoneNumber(text):
    # criterion 1: exactly 12 characters (10 digits and 2 hyphens)
    if len(text) != 12:
        return False
    for i in range(0,3):
        if not text[i].isdecimal():
            return False
    if text[3] != "-":
        return False
    for i in range(4,7):
        if not text[i].isdecimal():
            return False
    if text[7] != "-":
        return False
    for i in range(8,12):
        if not text[i].isdecimal():
            return False 
    return True


In [86]:
print("324-342-2323 is a phone number:" )
print(isPhoneNumber("324-342-2323"))
print("This is a {} phone number".format(isPhoneNumber("hello world")))
message = "Call me at 415-555-1011 tomorrow, 415-555-9999 is my office."
for i in range(len(message)):
    chunk = message[i:i+12]
    if isPhoneNumber(chunk):
        print(chunk)


324-342-2323 is a phone number:
True
This is a False phone number
415-555-1011
415-555-9999


### Using a **regular expression** to solve the above problem.
- Describe the pattern of the text with a regular expression.
- The *re* module provides regular expression support.
    1. First, represent the pattern as a (raw) string.
    2. Then, pass the pattern string to *re.compile()* to create a **Regex** object.
    3. Then, search the text using the Regex object as a template: *regex.search(str)* returns a match object, or None if nothing is found.
    4. Call the match object's *group()* or *groups()* method to reveal the text that was found.
        (1) The Regex object's *search()* method returns only the first match.
        (2) The Regex object's *findall()* method returns all the matches in a list.

In [106]:
import re
print("Step-1: Create a Regex object")
phoneNumberRegex = re.compile(r"(\d{3})-(\d{3}-\d{4})")
print(type(phoneNumberRegex))

print("\n Step-2: Search")
result = phoneNumberRegex.search(message)
print(result.groups())
result_all = phoneNumberRegex.findall(message)
print(result_all)

Step-1: Create a Regex object
<class 're.Pattern'>

 Step-2: Search
('415', '555-1011')
[('415', '555-1011'), ('415', '555-9999')]


### More on regular expression
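- pipe search: | matches one of several alternatives ("or").
- optional match: ? marks the group that precedes it as an optional part of the pattern.
- match zero or more: * matches zero or more repetitions of the group that precedes it.
- match one or more: + matches one or more repetitions of the group that precedes it.
- match with repetitions: {num} matches exactly num repetitions of the group that precedes it.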

In [126]:
print("EXAMPLE of Pipe search:")
batRegex = re.compile(r"Bat(man|mobile|copter|bat)")
mo = batRegex.search("Batmobile lost a wheel")
print(mo.group())
print(mo.group(0))
print(mo.group(1))
print(mo.groups())

print("\n EXAMPLE of Optional Match:")
batRegex = re.compile(r"Bat(wo)?man")
mo = batRegex.search("The Adventures of Batman")
print(mo.group())

print("\n EXAMPLE of Match Zero or More:")
batRegex = re.compile(r"Bat(wo)*man")
mo = batRegex.search("The Adventures of Batwowoman")
print(mo.group())

print("\n EXAMPLE of Match One or More:")
batRegex = re.compile(r"Bat(wo)+man")
mo = batRegex.search("The Adventures of Batwowowoman")
print(mo.group())

print("\n EXAMPLE of Repetitive Match:")
batRegex = re.compile(r"(ha){3}")
mo = batRegex.search("hahahahaha")
print(mo.group())
mo = batRegex.search("ha")
print("mo is none? {}".format(mo is None))

EXAMPLE of Pipe search:
Batmobile
Batmobile
mobile
('mobile',)

 EXAMPLE of Optional Match:
Batman

 EXAMPLE of Match Zero or More:
Batwowoman

 EXAMPLE of Match One or More:
Batwowowoman

 EXAMPLE of Repetitive Match:
hahaha
mo is none? True


### Character Class
- \d: any numerical digit 0-9
- \D: any character other than a numerical digit 0-9
- \w: any letter, numerical digit, or the underscore character (word)
- \W: any character that is not a letter, numerical digit, or the underscore character
- \s: any space, tab, or newline character (space)
- \S: any character that is not a space, tab, or newline
- []: defines a custom character class, such as a set or a range of digits or letters.
- ^, \$: anchor the match at the start or at the end of the text.
- .: dot, matches any single character except a newline character.
- .*: matches any run of characters (zero or more) except newlines.
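
The shorthand classes and anchors listed above can be combined in one pattern. The following sketch is only an illustration; the sample strings and variable names are made up for the example.

```python
import re

text = "12 drummers, 11 pipers, 10 lords"

# \d+ : one or more digits, \s : one whitespace, \w+ : one or more word characters
pairs = re.findall(r"\d+\s\w+", text)

# [0-5] is a custom class that matches only the digits 0 through 5.
low_digits = re.findall(r"[0-5]", "Score: 1729")

# ^ and $ together require the whole string to match the pattern.
whole_number = re.search(r"^\d+$", "2024") is not None
not_whole = re.search(r"^\d+$", "2024 AD") is None

# .* captures everything up to the end of the line.
rest = re.search(r"First Name: (.*)", "First Name: Al Sweigart").group(1)

print(pairs)         # ['12 drummers', '11 pipers', '10 lords']
print(low_digits)    # ['1', '2']
print(whole_number)  # True
print(not_whole)     # True
print(rest)          # Al Sweigart
```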

In [130]:
vowelRegex = re.compile(r"[aeiouAEIOU]")
mo = vowelRegex.search("RoboCop eats baby food. BABY FOOD.")
print(mo.group())
mo = vowelRegex.findall("RoboCop eats baby food. BABY FOOD.")
print(mo)

o
['o', 'o', 'o', 'e', 'a', 'a', 'o', 'o', 'A', 'O', 'O']


In [132]:
endwithNumber = re.compile(r"\d$")
print(endwithNumber.search("Your number is 42").group())

2


In [137]:
atRegex = re.compile(r".at")
print(atRegex.findall("The cat in the hat."))

['cat', 'hat']


#### Combine re options:
Options are passed as the second argument of *re.compile()* and can be combined with the | operator.
- re.IGNORECASE: ignore the difference between uppercase and lowercase in the match
- re.DOTALL: the dot . represents any single character, including the newline character
- re.VERBOSE: allow whitespace and comments inside the regular expression string.
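
A minimal sketch of the three options follows. The sample strings and the names `robocop`, `phone`, and so on are chosen for illustration only.

```python
import re

# re.IGNORECASE: the pattern matches regardless of letter case.
robocop = re.compile(r"robocop", re.IGNORECASE)
case_hits = robocop.findall("RoboCop, ROBOCOP and robocop")

# re.DOTALL: without it, .* stops at the first newline.
text = "Serve the public trust.\nProtect the innocent."
no_dotall = re.search(r".*", text).group()
with_dotall = re.search(r".*", text, re.DOTALL).group()

# re.VERBOSE: whitespace and # comments inside the pattern are ignored.
# Several options are combined with the | operator.
phone = re.compile(r"""
    (\d{3})   # area code
    -         # separator
    (\d{3})   # first three digits
    -         # separator
    (\d{4})   # last four digits
    """, re.VERBOSE | re.IGNORECASE)
parts = phone.search("Call 415-555-1011").groups()

print(case_hits)    # ['RoboCop', 'ROBOCOP', 'robocop']
print(no_dotall)    # Serve the public trust.
print(with_dotall == text)  # True
print(parts)        # ('415', '555', '1011')
```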

### Substituting Strings with the sub() Method
- In the replacement string passed as the first argument of **sub()**, \1, \2, \3 mean "insert the text of group 1, 2, 3 into the substitution".

In [136]:
print("EXAMPLE of Sub():")
namesRegex = re.compile(r"Agent \w+")
print(namesRegex.sub("CENSORED", "Agent Alice gave the secret documents to Agent Bob."))

print(r"EXAMPLE of Sub() with \1:")
namesRegex = re.compile(r"Agent (\w)\w+")
print(namesRegex.sub(r"\1****", "Agent Alice gave the secret documents to Agent Bob."))

EXAMPLE of Sub():
CENSORED gave the secret documents to CENSORED.
EXAMPLE of Sub() with \1:
A**** gave the secret documents to B****.
