# Advanced Modules

* [Collections](#collections)
* [OS module](#os_module)
* [Datetime](#datetime)
* [Math](#math)
* [Random](#random)
* [Python Debugger](#debugger)
* [Regular Expressions](#reg_express)
* [Timeit](#timeit)
* [Zipping and Unzipping Files](#zip_unzip)

<a id='collections'></a>
## Collections<br>
Implements specialized container datatypes that extend the built-in dict, list, set, and tuple (e.g. Counter, defaultdict, namedtuple)


In [None]:
from collections import Counter

In [None]:
mylist = [1,1,2,2,2,3,3,3,3,4,4,5,6,6,7,8,8,8]

In [None]:
Counter(mylist)

In [None]:
newlist = ['a','a','b','b','b',5,5,10,11,11,11]

In [None]:
Counter(newlist)

Counter is a dict subclass that maps each element to the number of times it appears
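
Because Counter is a kind of dictionary, ordinary dict operations work on it, with a few counting-specific extras. The sketch below illustrates this with a hypothetical `inventory` list (the data and variable names are assumptions for illustration, not part of the cells in this notebook).

```python
from collections import Counter

# Hypothetical sample data, not taken from the cells in this notebook
inventory = Counter(['apple', 'pear', 'apple', 'fig', 'apple', 'pear'])

# Counter subclasses dict, so ordinary dict lookups work
is_dict = isinstance(inventory, dict)
apple_count = inventory['apple']

# Unlike a plain dict, a missing key returns 0 instead of raising KeyError
missing_count = inventory['kiwi']

# most_common(n) returns the n highest counts as (item, count) pairs
top_two = inventory.most_common(2)

# update() adds to the existing counts rather than overwriting them
inventory.update(['fig', 'fig'])
fig_count = inventory['fig']

print(is_dict, apple_count, missing_count, top_two, fig_count)
# True 3 0 [('apple', 3), ('pear', 2)] 3
```

Looking up a missing key does not insert it, so `'kiwi' in inventory` remains False afterwards.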

In [None]:
Counter('aaaabbbbbbeeeerrrrffffff')

In [None]:
sentence = "Words on words on sentences on phrases"

In [None]:
Counter(sentence.lower().split())

In [None]:
letters = "sdkajhsldkfjhsa"

In [None]:
c = Counter(letters)

In [None]:
c

In [None]:
c.most_common()

In [None]:
list(c)

In [None]:
from collections import defaultdict

In [None]:
d = {'a': 10}

In [None]:
d['a']

In [None]:
d["WRONG"]  # raises KeyError because a plain dict has no default for missing keys

In [None]:
dd = defaultdict(lambda: 0)

In [None]:
dd['yes']

In [None]:
dd['no']

In [None]:
d

In [None]:
dd

In [None]:
mytuple = (10,20,30)

In [None]:
from collections import namedtuple

In [None]:
Dog = namedtuple("Dog", ['age','breed','name'])

In [None]:
Darla = Dog(age=7,breed='Lab',name="Darla")

In [None]:
Darla

<a id='os_module'></a>
## OS Module

In [None]:
pwd

In [None]:
f = open('practice.txt','w+')

In [None]:
f.write("This is a test string")
f.close()

In [None]:
import os

In [None]:
os.getcwd()

In [None]:
os.listdir()

In [None]:
os.listdir('C:\\Users\\leigh\\udemy\\python\\Complete-Python-3-Bootcamp-master')

In [None]:
import shutil

In [None]:
shutil.move('practice.txt','C:\\Users\\leigh\\udemy\\java')

In [None]:
os.listdir('C:\\Users\\leigh\\udemy\\java')

### Deleting files
There are 3 functions for deleting files and folders:
* os.unlink(path) deletes a file at the provided path
* os.rmdir(path) deletes a folder (if the folder is empty) at the provided path
* shutil.rmtree(path) deletes ALL FILES AND FOLDERS at the provided path (this one lives in the shutil module; os has no rmtree function)

<br>
These deletions are permanent and cannot be undone<br>The third-party send2trash module moves files to the trash bin instead, so the deletion can be reversed
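
The three kinds of deletion can be tried safely inside a throwaway directory. The following is a minimal sketch: the file and folder names are hypothetical, and everything happens under a temporary directory created by `tempfile.mkdtemp()`, so no real files are touched. Note that the recursive delete is `shutil.rmtree`, since the os module itself has no rmtree function.

```python
import os
import shutil
import tempfile

# Work inside a throwaway directory so nothing important is deleted
base = tempfile.mkdtemp()

# os.unlink removes a single file
file_path = os.path.join(base, 'scratch.txt')   # hypothetical file name
with open(file_path, 'w') as f:
    f.write('temporary')
os.unlink(file_path)
file_gone = not os.path.exists(file_path)

# os.rmdir removes a folder only if it is empty
empty_dir = os.path.join(base, 'empty')
os.mkdir(empty_dir)
os.rmdir(empty_dir)
empty_gone = not os.path.exists(empty_dir)

# os.rmdir refuses to remove a folder that still has contents
full_dir = os.path.join(base, 'full')
os.mkdir(full_dir)
with open(os.path.join(full_dir, 'note.txt'), 'w') as f:
    f.write('still here')
try:
    os.rmdir(full_dir)
    rmdir_failed = False
except OSError:
    rmdir_failed = True

# shutil.rmtree removes the folder and everything inside it
shutil.rmtree(base)
base_gone = not os.path.exists(base)

print(file_gone, empty_gone, rmdir_failed, base_gone)
```

All four flags should print as True: the file and the empty folder are removed, the non-empty folder is refused, and the final `shutil.rmtree` call clears the whole tree.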

In [None]:
import send2trash

In [None]:
shutil.move('C:\\Users\\leigh\\udemy\\java\\practice.txt',os.getcwd())

In [None]:
send2trash.send2trash('practice.txt')

In [None]:
pwd

In [None]:
os.listdir()

In [16]:
file_path = 'C:\\Users\\leigh\\udemy\\python\\Complete-Python-3-Bootcamp-master\\12-Advanced Python Modules\\Example_Top_Level'

In [17]:
for folder, sub_folders, files in os.walk(file_path):
    print(f'Currently looking at {folder}')
    print('\n')
    print('The subfolders are: ')
    for sub_fold in sub_folders:
        print(f'\t Subfolder: {sub_fold}')

    print('\n')
    print("The files are: ")

    for file in files:
        print(f'\t File: {file}')
    print('\n')

Currently looking at C:\Users\leigh\udemy\python\Complete-Python-3-Bootcamp-master\12-Advanced Python Modules\Example_Top_Level


The subfolders are: 
	 Subfolder: Mid-Example-One


The files are: 
	 File: Mid-Example.txt


Currently looking at C:\Users\leigh\udemy\python\Complete-Python-3-Bootcamp-master\12-Advanced Python Modules\Example_Top_Level\Mid-Example-One


The subfolders are: 
	 Subfolder: Bottom-Level-One
	 Subfolder: Bottom-Level-Two


The files are: 
	 File: Mid-Level-Doc.txt


Currently looking at C:\Users\leigh\udemy\python\Complete-Python-3-Bootcamp-master\12-Advanced Python Modules\Example_Top_Level\Mid-Example-One\Bottom-Level-One


The subfolders are: 


The files are: 
	 File: One_Text.txt


Currently looking at C:\Users\leigh\udemy\python\Complete-Python-3-Bootcamp-master\12-Advanced Python Modules\Example_Top_Level\Mid-Example-One\Bottom-Level-Two


The subfolders are: 


The files are: 
	 File: Bottom-Text-Two.txt




<a id='datetime'></a>
## Datetime<br>
Create objects with information about dates, times, timezones, time durations, etc
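
As a quick orientation, the sketch below builds one object of each kind mentioned above: a date, a time, a combined datetime, a duration (timedelta), and a timezone-aware datetime. The specific values and variable names are hypothetical and chosen only for illustration.

```python
from datetime import date, datetime, time, timedelta, timezone

# A calendar date, a time of day, and a combined datetime (hypothetical values)
launch_day = date(2023, 11, 3)
launch_time = time(16, 20)
launch = datetime.combine(launch_day, launch_time)

# A timedelta represents a duration and can be added or subtracted
one_day_earlier = launch - timedelta(days=1)

# Subtracting a timedelta handles month boundaries correctly
month_boundary = datetime(2023, 11, 1) - timedelta(days=1)

# Attaching a timezone makes the datetime "aware"
launch_utc = launch.replace(tzinfo=timezone.utc)

print(launch)                  # 2023-11-03 16:20:00
print(one_day_earlier)         # 2023-11-02 16:20:00
print(month_boundary)          # 2023-10-31 00:00:00
print(launch_utc.isoformat())  # 2023-11-03T16:20:00+00:00
```

Date arithmetic through timedelta is generally safer than editing the day field by hand, because it rolls over months and years automatically.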

In [None]:
import datetime

In [None]:
mytime = datetime.time(15,20,1,20)

In [None]:
mytime.minute

In [None]:
print(mytime)

In [None]:
today = datetime.date.today()

In [None]:
print(today)

In [None]:
today.ctime()

In [None]:
from datetime import datetime

In [None]:
mydatetime = datetime(today.year,today.month,today.day,17,40)

In [None]:
print(mydatetime)

In [None]:
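# subtract a timedelta; replace(day=today.day-1) raises ValueError on the first of the month
from datetime import timedelta
yesterday = mydatetime - timedelta(days=1)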

In [None]:
print(yesterday)

In [None]:
from datetime import date

In [None]:
date1 = date(2023,11,3)
date2 = date(2022,1,3)

In [None]:
time_diff = date1 - date2

In [None]:
type(time_diff)

In [None]:
time_diff.days

In [None]:
datetime1 = datetime(2023,11,3,16,20)
datetime2 = datetime(2022,1,3,4,20)

In [None]:
timediff = datetime1 - datetime2  # later minus earlier, matching date1 - date2 above

In [None]:
43200/60/60  # timediff.seconds is 43200; dividing by 60 twice converts it to 12 hours

In [None]:
timediff.total_seconds()

# Math and Random

<a id='math'></a>
## Math

In [None]:
import math

In [None]:
help(math)

In [None]:
value = 4.35

In [None]:
math.floor(value)

In [None]:
math.ceil(value)

In [None]:
round(value)

In [None]:
round(4.5)

In [None]:
round(5.5)

round() uses round-half-to-even (also called banker's rounding): a value exactly halfway between two integers goes to the nearest even integer, so round(4.5) is 4 and round(5.5) is 6<br>
This keeps averages from drifting too low or too high, as they would if ties always rounded in one direction
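
The tie-breaking behaviour is easiest to see by rounding a run of values that end in .5. The sketch below also shows one possible alternative, using the standard-library decimal module, for cases where halves should always round up; choosing decimal here is an illustrative assumption, not something the notebook relies on.

```python
from decimal import Decimal, ROUND_HALF_UP

# Built-in round() sends exact ties to the nearest even integer
ties = [round(x) for x in (0.5, 1.5, 2.5, 3.5, 4.5, 5.5)]
print(ties)   # [0, 2, 2, 4, 4, 6]

# Values that are not exact ties round to the nearest integer as usual
not_ties = (round(4.35), round(4.51))
print(not_ties)   # (4, 5)

# If "always round halves up" is required, decimal offers ROUND_HALF_UP
half_up = int(Decimal('4.5').quantize(Decimal('1'), rounding=ROUND_HALF_UP))
print(half_up)   # 5
```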

In [None]:
math.pi

In [None]:
math.e

In [None]:
math.inf

In [None]:
# not a number
math.nan

The NumPy library is built for numeric processing and goes much deeper than the math module

In [None]:
math.log(math.e)

In [None]:
math.log(100,10)

In [None]:
math.sin(10)

In [None]:
math.degrees(math.pi/2)

<a id='random'></a>
## Random

In [None]:
import random

In [None]:
random.randint(1,100)

In [None]:
random.seed(101)

random.randint(0,100)

In [None]:
random.randint(0,100)

In [None]:
print(random.randint(0,100))
print(random.randint(0,100))
print(random.randint(0,100))
print(random.randint(0,100))
print(random.randint(0,100))
print(random.randint(0,100))

In [None]:
mylist = list(range(0,20))

In [None]:
mylist

In [None]:
random.choice(mylist)

In [None]:
# sample with replacement
random.choices(population=mylist,k=10)

In [None]:
# sample without replacement
random.sample(population=mylist,k=10)

In [None]:
mylist

In [None]:
random.shuffle(mylist)

In [None]:
mylist

In [None]:
random.uniform(a=0,b=100)

In [None]:
random.gauss(mu=0,sigma=1)

<a id='debugger'></a>
## Python Debugger

In [None]:
x = [1,2,3]
y = 2
z = 3

result = y + z
result2 = x + y  # TypeError: can only concatenate list (not "int") to list

In [None]:
import pdb

In [None]:
x = [1,2,3]
y = 2
z = 3

result_one = y + z

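pdb.set_trace()  # pauses here; inspect x, y, z at the (Pdb) prompt, then enter q to quit
result_two = y + x  # the line that raises the TypeError being investigated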

<a id='reg_express'></a>
## Regular Expressions<br>
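Regular expressions parse through text to find general patterns (e.g. emails, phone numbers)<br>
They are found in the "re" module of the standard library

Patterns are usually written as raw strings (the r prefix) so that backslashes reach the regex engine unchanged<br>
e.g.<br>
* Phone Number
    * 555-555-5555
* Regex pattern
    * r"(\d\d\d)-\d\d\d-\d\d\d\d"

In this instance, \d stands for "digit"<br>
Dashes match themselves; the parentheses do not match literal parentheses but capture the first three digits as a group (a literal parenthesis needs a backslash in front of it)<br>
A more concise way to write this Regex pattern is:
* r"(\d{3})-\d{3}-\d{4}"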
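
One detail worth checking by hand is how parentheses behave: unescaped, they form a capture group rather than matching literal ( and ) characters. The sketch below, using a hypothetical `sample` string that is not part of this notebook, compares the long and short forms of the pattern and shows the escaped version needed for a number written as (555)-555-1234.

```python
import re

# Hypothetical sample text, not from the cells in this notebook
sample = "Office: 555-555-5555, mobile: (555)-555-1234"

# The long form and the {n} form describe the same pattern
long_form = re.fullmatch(r"\d\d\d-\d\d\d-\d\d\d\d", "555-555-5555") is not None
short_form = re.fullmatch(r"\d{3}-\d{3}-\d{4}", "555-555-5555") is not None

# Unescaped parentheses create a capture group around the first three digits
area_code = re.search(r"(\d{3})-\d{3}-\d{4}", sample).group(1)

# Digits and dashes only: finds the number written without parentheses
plain_matches = re.findall(r"\d{3}-\d{3}-\d{4}", sample)

# Escaped parentheses match literal "(" and ")" characters
paren_matches = re.findall(r"\(\d{3}\)-\d{3}-\d{4}", sample)

print(long_form, short_form)   # True True
print(area_code)               # 555
print(plain_matches)           # ['555-555-5555']
print(paren_matches)           # ['(555)-555-1234']
```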

In [None]:
text = "The agent's phone number is 408-555-1234. Call soon!"

In [None]:
"phone" in text

In [5]:
import re

In [None]:
pattern = 'phone'

In [None]:
re.search(pattern,text)

In [None]:
pattern_one = 'not in text'

In [None]:
re.search(pattern_one,text)

In [None]:
match = re.search(pattern,text)

In [None]:
match

In [None]:
match.span()

In [None]:
match.end()

In [None]:
match.start()

In [None]:
match = re.search('phone',text)

In [None]:
match

In [9]:
text = "I have her phone, she has my phone"

In [10]:
matches = re.findall('phone',text)

In [11]:
matches

['phone', 'phone']

In [12]:
len(matches)

2

In [14]:
for match in re.finditer('phone',text):
    print(match)

<re.Match object; span=(11, 16), match='phone'>
<re.Match object; span=(29, 34), match='phone'>


### Character identifiers
\d digits (e.g. 0 through 9)<br>
\w alphanumeric (letters, digits, and the underscore)<br>
\s white space (e.g. spaces, tabs, newlines)<br>
\D non-digits (e.g. letters, punctuation, spaces)<br>
\W non-alphanumeric (e.g. punctuation, spaces)<br>
\S non-whitespace (any character that is not a space, tab, or newline)
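
Running each identifier against the same string makes the differences concrete. This is a minimal sketch on a hypothetical `sample` string (an assumption for illustration); note in particular that \w treats the underscore as a word character.

```python
import re

sample = "Room_12 opens at 9:30 am!"   # hypothetical sample text

digits = re.findall(r"\d", sample)
word_runs = re.findall(r"\w+", sample)
spaces = re.findall(r"\s", sample)
non_digit_runs = re.findall(r"\D+", sample)
non_word = re.findall(r"\W", sample)
non_space_runs = re.findall(r"\S+", sample)

print(digits)          # ['1', '2', '9', '3', '0']
print(word_runs)       # ['Room_12', 'opens', 'at', '9', '30', 'am']
print(len(spaces))     # 4
print(non_digit_runs)  # ['Room_', ' opens at ', ':', ' am!']
print(non_word)        # [' ', ' ', ' ', ':', ' ', '!']
print(non_space_runs)  # ['Room_12', 'opens', 'at', '9:30', 'am!']
```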

In [6]:
text = 'My phone number is 408-555-1234'

In [7]:
phone = re.search(r"\W*\d{3}\W*-\d{3}-\d{4}",text)

In [8]:
phone

<re.Match object; span=(18, 31), match=' 408-555-1234'>

### Quantifiers
\+ occurs one or more times<br>
{x} occurs exactly x times<br>
{x,y} occurs x to y times<br>
{x,} occurs at least x times<br>
\* occurs zero or more times<br>
? occurs once or not at all
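
The quantifiers can be compared side by side by testing each one against strings of increasing length. The sketch below is illustrative only: the `candidates` list and the `matching` helper are hypothetical names introduced here, and `re.fullmatch` is used so that a pattern counts only if it covers the entire string.

```python
import re

# Hypothetical test strings of increasing length
candidates = ["", "a", "aa", "aaa", "aaaa"]

def matching(pattern):
    """Return the candidates that the pattern matches in full."""
    return [s for s in candidates if re.fullmatch(pattern, s)]

one_or_more = matching(r"a+")
exactly_two = matching(r"a{2}")
two_to_three = matching(r"a{2,3}")
at_least_three = matching(r"a{3,}")
zero_or_more = matching(r"a*")
optional = matching(r"a?")

print(one_or_more)     # ['a', 'aa', 'aaa', 'aaaa']
print(exactly_two)     # ['aa']
print(two_to_three)    # ['aa', 'aaa']
print(at_least_three)  # ['aaa', 'aaaa']
print(zero_or_more)    # ['', 'a', 'aa', 'aaa', 'aaaa']
print(optional)        # ['', 'a']
```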

In [None]:
phone_pattern = re.compile(r'(\W*\d{3}\W*)-(\d{3})-(\d{4})')

In [None]:
results = re.search(phone_pattern,text)

In [None]:
results.group()

In [None]:
results.group(1)

### Additional Syntax and Grouping

In [None]:
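# "or" syntax: the pipe matches either alternative
# (no spaces around |, since a space would become part of the alternative)
re.search(r'cat|dog', 'The cat is here')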

In [None]:
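# the . wildcard matches any single character except a newline
# here, three wildcards grab the three characters in front of each "at"
re.findall(r'...at', "The cat in the hat went flat")
# character identifiers (e.g. \S) can be used in place of wildcards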

In [None]:
# the caret anchors the pattern to the start of the string
re.findall(r'^\d', '1 is a number')

In [None]:
# the dollar sign anchors the pattern to the end of the string
re.findall(r'\d$', "My favorite number is 69")

In [None]:
# square brackets define a set of characters; a caret right after [ excludes that set
phrase = "my favorite numbers are 69 and also 420 but 666 is cool too"
re.findall(r'[^\d]+', phrase)

In [None]:
# this is useful for removing punctuation from a sentence
test_phrase = 'This is a string! It has punctuation, so we need to remove it. How do we do that?'

In [None]:
clean = re.findall(r'[^!?,. ]+', test_phrase)
' '.join(clean)

In [None]:
# whereas a caret inside brackets excludes characters, brackets without it include them
text = "Only find the hyphen-ated words in this sen-tence but you don't know how long-ish they are"
pattern = r'[\w]+-[\w]+'

In [None]:
re.findall(pattern,text)

In [None]:
text = "Hello, would you like some catfish?"
text_two = "Hello, would you like to take a catnap"
text_three = "Hello, would you like to see this caterpillar"

In [None]:
# parentheses are used to group multiple options for searches
re.search(r'cat(fish|nap|erpillar)', text_three)

<a id='timeit'></a>
## Timeit

In [None]:
def func_one(n):
    return [str(num) for num in range(n)]

In [None]:
func_one(10)

In [None]:
def func_two(n):
    return list(map(str,range(n)))

In [None]:
func_two(10)

In [None]:
import time

In [None]:
# grab current time before running code
start_time = time.time()
#run code
result = func_one(100000)
#current time after running code
end_time = time.time()
elapsed_time = end_time-start_time

print(elapsed_time)

In [None]:
import timeit

In [None]:
stmt = '''
func_one(100)
'''

In [None]:
setup = '''
def func_one(n):
    return [str(num) for num in range(n)]
'''

In [None]:
stmt_two = '''
func_two(100)
'''

In [None]:
setup_two= '''
def func_two(n):
    return list(map(str,range(n)))
'''

In [None]:
timeit.timeit(stmt,setup,number=100000)

In [None]:
timeit.timeit(stmt_two,setup_two,number=100000)

In [None]:
%%timeit
func_one(100)

In [None]:
%%timeit
func_two(100)

<a id='zip_unzip'></a>
## Zipping and Unzipping Files

In [None]:
# the with block closes the file automatically
with open('fileone.txt','w+') as f:
    f.write("one file")

In [None]:
# the with block closes the file automatically
with open('filetwo.txt','w+') as f:
    f.write("two file")

In [None]:
import zipfile

In [None]:
comp_file = zipfile.ZipFile('comp_file.zip','w')

In [None]:
comp_file.write('fileone.txt',compress_type=zipfile.ZIP_DEFLATED)

In [None]:
comp_file.write('filetwo.txt',compress_type=zipfile.ZIP_DEFLATED)

In [None]:
comp_file.close()

In [None]:
zip_obj = zipfile.ZipFile('comp_file.zip','r')

In [None]:
zip_obj.extractall('extracted_content')

In [None]:
import shutil

In [None]:
pwd

In [None]:
dir_zip = 'C:\\Users\\leigh\\udemy\\python\\extracted_content'

In [None]:
output_filename = 'example'

In [None]:
shutil.make_archive(output_filename,'zip',dir_zip)

In [None]:
shutil.unpack_archive('example.zip','unzip_dir','zip')