# Module 14: Advanced Python Modules

## Collections Module

**`collections` Module in Python**

The `collections` module provides specialized container types that extend or replace the built-in `dict`, `list`, and `tuple` for specific tasks, such as counting elements, supplying default values for missing keys, or adding and removing items at both ends of a sequence.

- **`Counter`**: A subclass of `dict` that counts the occurrences of elements in an iterable.
- **`defaultdict`**: A subclass of `dict` that calls a factory function to create (and store) a default value when a key is missing, instead of raising `KeyError`.
- **`namedtuple`**: Factory function for creating tuple subclasses with named fields.
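- **`OrderedDict`**: A dictionary that remembers the order in which items were inserted. Since Python 3.7 the built-in `dict` also preserves insertion order, so `OrderedDict` is mainly useful for its reordering methods such as `move_to_end()`.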
- **`deque`**: A list-like container that supports fast appends and pops from both ends.

These data structures help simplify common programming tasks and improve performance for specific use cases.


In [1]:
from collections import Counter

In [2]:
mylist = [1,1,1,2,2,2,2,2,3,3,3,3,3,3,3,4,4,4,5,5,5]
Counter(mylist)

Counter({3: 7, 2: 5, 1: 3, 4: 3, 5: 3})

In [3]:
mylist = ['a','a',10,10,10]
Counter(mylist)

Counter({10: 3, 'a': 2})

In [4]:
Counter('aaaabbbcccccddef')

Counter({'c': 5, 'a': 4, 'b': 3, 'd': 2, 'e': 1, 'f': 1})

In [5]:
sentence = 'How many times does each word show up in this sentence with a word'

Counter(sentence.split())

Counter({'word': 2,
         'How': 1,
         'many': 1,
         'times': 1,
         'does': 1,
         'each': 1,
         'show': 1,
         'up': 1,
         'in': 1,
         'this': 1,
         'sentence': 1,
         'with': 1,
         'a': 1})

In [6]:
letters = 'aaabbbbccccccdddddedfffffff'

In [7]:
c = Counter(letters)

In [8]:
c

Counter({'f': 7, 'c': 6, 'd': 6, 'b': 4, 'a': 3, 'e': 1})

In [9]:
c.most_common()

[('f', 7), ('c', 6), ('d', 6), ('b', 4), ('a', 3), ('e', 1)]

In [10]:
c.most_common(2)

[('f', 7), ('c', 6)]

In [11]:
list(c)

['a', 'b', 'c', 'd', 'e', 'f']

In [12]:
from collections import defaultdict

In [13]:
d = {'a': 10}

In [14]:
d

{'a': 10}

In [15]:
d['a']

10

In [16]:
d['WRONG']

KeyError: 'WRONG'

In [17]:
d = defaultdict(lambda: 0)

In [18]:
d['correct'] = 100

In [19]:
d['correct']

100

In [20]:
d["Wrong"]

0
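
The factory passed to `defaultdict` does not have to return a constant. A common pattern uses `list` as the factory to group items. The sketch below is a minimal illustration; the word list and variable names are made up for the example.

```python
from collections import defaultdict

# Hypothetical sample data for illustration
words = ['apple', 'avocado', 'banana', 'blueberry', 'cherry']

# list() is called to create an empty list the first time a key is accessed
groups = defaultdict(list)
for word in words:
    groups[word[0]].append(word)

print(dict(groups))

# Merely looking up a missing key inserts it with the default value
print('z' in groups)   # False
groups['z']
print('z' in groups)   # True
```

Note the side effect in the last lines: reading a missing key stores it, which differs from `dict.get()`.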

In [21]:
mytuple = (10,20,30)

In [22]:
mytuple[0]

10

In [23]:
from collections import namedtuple

In [24]:
Dog = namedtuple('Dog',['age','breed','name'])

In [25]:
sammy = Dog(age=5,breed='Husky',name='Sammy')

In [26]:
sammy

Dog(age=5, breed='Husky', name='Sammy')

In [27]:
type(sammy)

__main__.Dog

In [28]:
sammy.age

5

In [29]:
sammy.breed

'Husky'

In [30]:
sammy.name

'Sammy'

In [31]:
sammy[0]

5
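
The overview also lists `deque` and `OrderedDict`, which the cells above do not exercise. The following is a small sketch of their basic operations; the values used are arbitrary.

```python
from collections import deque, OrderedDict

# deque: add and remove items at either end
queue = deque([1, 2, 3])
queue.append(4)          # add on the right
queue.appendleft(0)      # add on the left
first = queue.popleft()  # remove from the left
last = queue.pop()       # remove from the right
print(first, last, queue)

# OrderedDict: keeps insertion order and can reorder keys
od = OrderedDict()
od['a'] = 1
od['b'] = 2
od['c'] = 3
od.move_to_end('a')      # move key 'a' to the last position
print(list(od))
```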

## OS Module

**`os` Module in Python**

The `os` module provides a way to interact with the operating system and perform tasks like file manipulation, process management, and environment management.

- **File and Directory Operations**: Functions like `os.mkdir()`, `os.remove()`, and `os.rename()` to work with files and directories.
- **Environment Variables**: Use `os.getenv()` and `os.environ` to access and modify environment variables.
- **Path Operations**: Functions like `os.path.join()`, `os.path.exists()`, and `os.path.abspath()` help manipulate and query file paths.
- **Process Management**: Use `os.system()` to run system commands and `os.getpid()` to get the current process ID.

The `os` module is essential for working with the underlying operating system from Python code.
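
As a rough sketch of the functions listed above, the example below creates, renames, and removes a file inside a temporary directory so that nothing else on disk is touched. The folder name, file names, and environment variable name are placeholders chosen for illustration.

```python
import os
import tempfile

# Work inside a throwaway directory that is deleted automatically
with tempfile.TemporaryDirectory() as tmp:
    # Build paths portably instead of concatenating strings
    folder = os.path.join(tmp, 'demo')      # hypothetical folder name
    os.mkdir(folder)

    old_path = os.path.join(folder, 'old.txt')
    new_path = os.path.join(folder, 'new.txt')

    with open(old_path, 'w') as f:
        f.write('hello')

    os.rename(old_path, new_path)
    renamed_ok = os.path.exists(new_path) and not os.path.exists(old_path)

    os.remove(new_path)
    removed_ok = not os.path.exists(new_path)

# getenv returns the fallback when the variable is not set
missing = os.getenv('SURELY_NOT_SET_12345', 'fallback')
pid = os.getpid()
print(renamed_ok, removed_ok, missing, pid)
```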


In [32]:
pwd

'c:\\Users\\MeetRadadiya\\training-crest\\Python'

In [33]:
# A with block closes the file automatically, even if an error occurs
with open('practice.txt', 'w') as f:
    f.write('This is a test string')

In [34]:
import os

In [35]:
os.getcwd()

'c:\\Users\\MeetRadadiya\\training-crest\\Python'

In [36]:
os.listdir()

['.ipynb_checkpoints',
 '1-DataTypes_Basics.ipynb',
 '10-Milestone Project 2.ipynb',
 '11-Milestone Project 2(Black Jack).ipynb',
 '12-Decorators.ipynb',
 '13-Generators.ipynb',
 '14-Advanced Python Modules.ipynb',
 '2-DataStructures.ipynb',
 '3-FileIO.ipynb',
 '4-Statementes and Comparison Operators.ipynb',
 '5-Methods in Python.ipynb',
 '6-Milestone Project.ipynb',
 '7-OOP.ipynb',
 '8-Modules and Packages.ipynb',
 '9-Errors and Exceptions Handling.ipynb',
 'Assessments',
 'practice.txt',
 'pylint.py']

In [37]:
import shutil

In [None]:
# Backslashes must be escaped in a normal string literal
shutil.move('practice.txt', 'C:\\Users\\Demo\\')

In [38]:
import send2trash

In [39]:
send2trash.send2trash('practice.txt')

In [40]:
for folder, sub_folders, files in os.walk(os.getcwd()):
    print(f'Currently looking at {folder}')
    print('\n')
    print('The subfolders are:')
    for sub_fold in sub_folders:
        print(f'\t Subfolder: {sub_fold}')

    print('\n')
    print('The files are:')
    for f in files:
        print(f'\t File: {f}')
    print('\n')

Currently looking at c:\Users\MeetRadadiya\training-crest\Python


The subfolders are:
	 Subfolder: .ipynb_checkpoints
	 Subfolder: Assessments


The files are:
	 File: 1-DataTypes_Basics.ipynb
	 File: 10-Milestone Project 2.ipynb
	 File: 11-Milestone Project 2(Black Jack).ipynb
	 File: 12-Decorators.ipynb
	 File: 13-Generators.ipynb
	 File: 14-Advanced Python Modules.ipynb
	 File: 2-DataStructures.ipynb
	 File: 3-FileIO.ipynb
	 File: 4-Statementes and Comparison Operators.ipynb
	 File: 5-Methods in Python.ipynb
	 File: 6-Milestone Project.ipynb
	 File: 7-OOP.ipynb
	 File: 8-Modules and Packages.ipynb
	 File: 9-Errors and Exceptions Handling.ipynb
	 File: pylint.py


Currently looking at c:\Users\MeetRadadiya\training-crest\Python\.ipynb_checkpoints


The subfolders are:


The files are:
	 File: 4-Statementes and Comparison Operators-checkpoint.ipynb
	 File: DataStructures-checkpoint.ipynb
	 File: DataTypes_Basics-checkpoint.ipynb
	 File: FileIO-checkpoint.ipynb
	 File: Untitled-checkp

## Date Time Module

**`datetime` Module in Python**

The `datetime` module provides classes for manipulating dates and times in both simple and complex ways.

- **`datetime.datetime`**: Represents a single point in time, combining both date and time.
- **`datetime.date`**: Represents just the date (year, month, day).
- **`datetime.time`**: Represents just the time (hour, minute, second, microsecond).
- **`datetime.timedelta`**: Represents a duration, such as the difference between two `date` or two `datetime` objects.
- **`datetime.datetime.strftime()`**: Formats a `datetime` object as a string.
- **`datetime.datetime.strptime()`**: Parses a string into a `datetime` object.

The `datetime` module is useful for working with time-based data, performing date arithmetic, and formatting dates for display.
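
Parsing, formatting, and `timedelta` arithmetic can be sketched as follows. The date string and format codes are chosen only for illustration.

```python
from datetime import datetime, timedelta

# Parse a string into a datetime using an explicit format
parsed = datetime.strptime('2021-10-03 14:20:01', '%Y-%m-%d %H:%M:%S')

# Format the datetime back into a differently shaped string
formatted = parsed.strftime('%d/%m/%Y %H:%M')
print(formatted)   # 03/10/2021 14:20

# Date arithmetic: adding a timedelta gives a new datetime
later = parsed + timedelta(days=7, hours=2)
print(later)       # 2021-10-10 16:20:01
```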


In [41]:
import datetime

In [42]:
mytime = datetime.time(2)

In [43]:
mytime.minute

0

In [44]:
mytime.hour

2

In [45]:
print(mytime)

02:00:00


In [46]:
mytime.microsecond

0

In [47]:
type(mytime)

datetime.time

In [48]:
today = datetime.date.today()

In [49]:
print(today)

2025-07-10


In [50]:
today.ctime()

'Thu Jul 10 00:00:00 2025'

In [51]:
from datetime import datetime

mydatetime = datetime(2021,10,3,14,20,1)


In [52]:
print(mydatetime)

2021-10-03 14:20:01


In [53]:
mydatetime = mydatetime.replace(year=2020)

In [54]:
print(mydatetime)

2020-10-03 14:20:01


In [55]:
from datetime import date

In [56]:
date1 = date(2021,11,3)
date2 = date(2020,11,3)

result = date1 - date2

In [57]:
result.days

365

In [58]:
datetime1 = datetime(2021,11,3,22,0)
datetime2 = datetime(2020,11,3,12,0)

In [59]:
result = datetime1 - datetime2

In [60]:
result

datetime.timedelta(days=365, seconds=36000)

In [61]:
result.seconds

36000

In [62]:
result.total_seconds()

31572000.0

## Math Module

**`math` Module in Python**

The `math` module provides mathematical functions and constants that help with basic mathematical operations.

- **Mathematical Constants**: Constants like `math.pi` (π), `math.e` (Euler’s number).
- **Basic Operations**: Functions like `math.sqrt()`, `math.pow()`, `math.fsum()`.
- **Trigonometry**: Functions like `math.sin()`, `math.cos()`, `math.tan()`.
- **Logarithms**: Functions like `math.log()`, `math.log10()`, `math.exp()`.
- **Factorial and Combinations**: Functions like `math.factorial()`, `math.comb()`, `math.perm()`.

The `math` module is optimized for performing precise mathematical calculations and is widely used in scientific computing and data analysis.
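
A few of the functions listed above are not used in the cells that follow. The sketch below shows them with arbitrary inputs; note that `math.comb()` and `math.perm()` require Python 3.8 or later.

```python
import math

root = math.sqrt(16)
power = math.pow(2, 10)       # always returns a float
print(root, power)            # 4.0 1024.0

# fsum tracks intermediate rounding errors that plain sum accumulates
values = [0.1] * 10
naive_total = sum(values)
exact_total = math.fsum(values)
print(naive_total, exact_total)

print(math.factorial(5))      # 120
print(math.comb(5, 2))        # 10, ways to choose 2 of 5 ignoring order
print(math.perm(5, 2))        # 20, ways to choose 2 of 5 respecting order
```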


In [63]:
import math

In [64]:
value = 4.35

In [65]:
print(math.floor(value))

4


In [66]:
print(math.ceil(value))

5


In [67]:
print(round(value))

4


In [68]:
print(math.pi)

3.141592653589793


In [69]:
print(math.e)

2.718281828459045


In [70]:
print(math.inf)

inf


In [71]:
print(math.nan)

nan


In [72]:
print(math.log(math.e))

1.0


In [73]:
print(math.log(100,10))

2.0


In [74]:
print(math.sin(10))

-0.5440211108893698


In [75]:
print(math.degrees(math.pi/2))

90.0


In [76]:
print(math.radians(180))

3.141592653589793


## Random Module

**`random` Module in Python**

The `random` module provides functions for generating random numbers and performing random operations.

- **`random.random()`**: Returns a random float in the half-open range [0.0, 1.0).
- **`random.randint(a, b)`**: Returns a random integer between `a` and `b` (inclusive).
- **`random.choice(sequence)`**: Selects a random element from a non-empty sequence.
- **`random.shuffle(list)`**: Randomly reorders the elements of a list in place.
- **`random.sample(population, k)`**: Returns a list of `k` elements chosen from the population without replacement, so no position is picked twice.

The `random` module is commonly used in simulations, games, and testing where random values are needed.
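
Because the values are pseudo-random, outputs differ from run to run unless the generator is seeded. The sketch below shows `random.seed()` making a sequence repeatable, together with `random.sample()` and `random.random()`; the seed value 42 is arbitrary.

```python
import random

# Seeding with the same value reproduces the same sequence
random.seed(42)
first_run = [random.randint(1, 6) for _ in range(5)]
random.seed(42)
second_run = [random.randint(1, 6) for _ in range(5)]
print(first_run == second_run)   # True

# sample draws without replacement, so there are no repeated positions
sample = random.sample(range(20), k=5)
print(sample)

# random() returns a float in [0.0, 1.0)
value = random.random()
print(value)
```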


In [77]:
import random

In [78]:
random.randint(0,100)

87

In [79]:
mylist = list(range(0,20))

In [80]:
random.choice(mylist)

4

In [81]:
random.choices(population=mylist,k=10)

[5, 11, 4, 9, 1, 2, 18, 8, 8, 3]

In [82]:
random.choices(population=mylist,k=10,weights=[0.5,0.1,0.1,0.1,0.1,0.1,1,1,1,1,1,1,1,1,1,1,1,1,1,1])

[6, 11, 17, 10, 9, 14, 14, 6, 16, 11]

In [83]:
random.shuffle(mylist)

In [84]:
mylist

[18, 1, 12, 6, 19, 2, 5, 13, 11, 8, 4, 16, 0, 10, 17, 7, 9, 15, 3, 14]

In [85]:
random.uniform(a=0,b=100)

14.69896536133717

In [86]:
random.gauss(mu=0,sigma=1)

0.9819420511659149

## Python Debugger

**Debugger in Python**

The Python debugger, `pdb`, allows you to interactively trace and debug your Python programs. It helps to inspect variables, step through code, and diagnose errors during execution.

- **Start Debugger**: Use `import pdb; pdb.set_trace()` (or the built-in `breakpoint()` in Python 3.7+) to pause execution at a specific point and enter the debugger.
- **Common Commands**:
  - `n`: Execute the next line of code.
  - `s`: Step into the function call.
  - `c`: Continue execution until the next breakpoint.
  - `p <expression>`: Print the value of an expression.
  - `q`: Quit the debugger.
  
The `pdb` module is useful for inspecting runtime behavior and fixing bugs efficiently.
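
The debugger is normally driven from the keyboard, but the commands above can also be fed in from a string, which makes a session reproducible. The sketch below is one way to do that, assuming a small hypothetical function `add`; it passes scripted input to `pdb.Pdb` and captures the debugger's output instead of printing it live.

```python
import io
import pdb

def add(a, b):
    total = a + b
    return total

# Scripted debugger input: print a, print b, then continue
commands = io.StringIO('p a\np b\nc\n')
captured = io.StringIO()

# readrc=False skips any user .pdbrc file so the run is predictable
debugger = pdb.Pdb(stdin=commands, stdout=captured, readrc=False)

# runcall starts the function under the debugger, stopping at its first line
result = debugger.runcall(add, 2, 3)

print(result)
print(captured.getvalue())
```

In everyday use you would type the same `p` and `c` commands at the `(Pdb)` prompt instead of scripting them.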


In [87]:
x = [1,2,3]
y = 2
z = 3

result1 = y + z
result2 = x + y + z
result3 = x + result1
result4 = result1 + result2 + result3

TypeError: can only concatenate list (not "int") to list

In [88]:
import pdb

In [89]:
x = [1,2,3]
y = 2
z = 3

result1 = y + z
pdb.set_trace()
result2 = x + y + z
result3 = x + result1
result4 = result1 + result2 + result3

> c:\users\meetradadiya\appdata\local\temp\ipykernel_17132\139234087.py(6)<module>()

[1, 2, 3]
2
*** NameError: name 'result2' is not defined


## Regular Expressions (RegEx)

**Regular Expressions in Python**

Regular expressions (regex) are patterns used to match, search, and manipulate text. In Python, they are supported by the `re` module.

- **Pattern Matching**: Use functions like `re.search()` and `re.match()` to find patterns in strings.
- **Finding All Matches**: `re.findall()` returns all non-overlapping matches in a string.
- **Substitution**: `re.sub()` replaces occurrences of a pattern with a replacement string.
- **Compilation**: `re.compile()` compiles a pattern for reuse and better performance.

Regular expressions are powerful tools for validating input, extracting data, and transforming text.
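
The difference between `re.match()` and `re.search()`, and the use of `re.sub()`, can be sketched as follows. The sample sentence is the one used in the cells below; the replacement character `X` is an arbitrary choice.

```python
import re

text = "The agent's phone number is 408-555-1234. Call soon!"

# re.match only succeeds if the pattern matches at the start of the string
no_match = re.match('phone', text)
start_match = re.match('The', text)
print(no_match)              # None
print(start_match.group())   # The

# re.search scans the whole string for the first match
found = re.search('phone', text)
print(found.start())         # 12

# re.sub replaces every match of the pattern
masked = re.sub(r'\d', 'X', text)
print(masked)
```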


In [17]:
text = "The agent's phone number is 408-555-1234. Call soon!"

In [18]:
'phone' in text

True

In [19]:
import re

In [20]:
pattern = 'phone'

In [21]:
re.search(pattern,text)

<re.Match object; span=(12, 17), match='phone'>

In [22]:
pattern = 'NOT IN TEXT'

In [23]:
re.search(pattern, text)

In [24]:
pattern = 'phone'

In [25]:
match = re.search(pattern, text)

In [26]:
match

<re.Match object; span=(12, 17), match='phone'>

In [27]:
match.span()

(12, 17)

In [28]:
match.start()

12

In [29]:
match.end()

17

In [30]:
text = 'my phone once, my phone twice'

In [31]:
match = re.search('phone', text)

In [32]:
match

<re.Match object; span=(3, 8), match='phone'>

In [33]:
matches = re.findall('phone',text)

In [34]:
matches

['phone', 'phone']

In [36]:
len(matches)

2

In [38]:
for match in re.finditer('phone',text):
    print(match.group())

phone
phone


**Character Identifiers in Regular Expressions**

Character identifiers are special sequences in regex that help define patterns for matching characters:

- **`.`**: Matches any character except a newline.
- **`\d`**: Matches any digit (0–9).
- **`\D`**: Matches any non-digit character.
- **`\w`**: Matches any alphanumeric character and underscore.
- **`\W`**: Matches any character that is not alphanumeric or an underscore.
- **`\s`**: Matches any whitespace character (space, tab, newline).
- **`\S`**: Matches any non-whitespace character.
- **`^`**: Matches the start of a string.
- **`$`**: Matches the end of a string.

These identifiers make it easier to build complex search patterns.
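
A quick way to see what each identifier matches is to run `re.findall()` over one short string. The sample string below is made up for illustration.

```python
import re

sample = 'Room 42, Floor_3!'

digits = re.findall(r'\d', sample)       # individual digits
words = re.findall(r'\w+', sample)       # runs of letters, digits, underscore
spaces = re.findall(r'\s', sample)       # whitespace characters
non_words = re.findall(r'\W', sample)    # everything \w does not match

print(digits)      # ['4', '2', '3']
print(words)       # ['Room', '42', 'Floor_3']
print(spaces)      # [' ', ' ']
print(non_words)   # [' ', ',', ' ', '!']
```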


In [40]:
text = 'My phone number is 408-555-1234'

In [41]:
phone = re.search("408-555-1234", text)

In [42]:
phone

<re.Match object; span=(19, 31), match='408-555-1234'>

In [43]:
phone = re.search(r'\d\d\d-\d\d\d-\d\d\d\d',text)

In [44]:
phone

<re.Match object; span=(19, 31), match='408-555-1234'>

In [45]:
phone.group()

'408-555-1234'

**Quantifiers in Regular Expressions**

Quantifiers define how many times a character or group can repeat in a pattern:

- **`*`**: Matches 0 or more occurrences.
- **`+`**: Matches 1 or more occurrences.
- **`?`**: Matches 0 or 1 occurrence.
- **`{n}`**: Matches exactly `n` occurrences.
- **`{n,}`**: Matches `n` or more occurrences.
- **`{n,m}`**: Matches between `n` and `m` occurrences.

Quantifiers help control the amount of text a pattern should match.
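
The sketch below runs each kind of quantifier against a small made-up string so the differences are visible side by side.

```python
import re

# * allows zero or more 'b' after 'a'
star = re.findall(r'ab*', 'a ab abb')
# + requires at least one 'b'
plus = re.findall(r'ab+', 'a ab abb')
# ? makes the 'u' optional
optional = re.findall(r'colou?r', 'color colour')
# {2,3} matches two or three digits, taking as many as it can
ranged = re.findall(r'\d{2,3}', '1 22 333 4444')
# {3,} matches three or more digits
at_least = re.findall(r'\d{3,}', '1 22 333 4444')

print(star)       # ['a', 'ab', 'abb']
print(plus)       # ['ab', 'abb']
print(optional)   # ['color', 'colour']
print(ranged)     # ['22', '333', '444']
print(at_least)   # ['333', '4444']
```

Quantifiers are greedy by default, which is why `\d{2,3}` takes `'444'` from `'4444'` and leaves a single `4` that no longer matches.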


In [46]:
phone = re.search(r'\d{3}-\d{3}-\d{4}',text)

In [47]:
phone

<re.Match object; span=(19, 31), match='408-555-1234'>

In [48]:
phone.group()

'408-555-1234'

In [49]:
phone_pattern = re.compile(r'(\d{3})-(\d{3})-(\d{4})')

In [50]:
results = re.search(phone_pattern, text)

In [51]:
results.group()

'408-555-1234'

In [52]:
results.group(1)

'408'

In [53]:
results.group(2)

'555'

In [54]:
results.group(3)

'1234'

In [None]:
results.group(4)  # Raises IndexError because the pattern has only three groups

IndexError: no such group

### Additional RegEx Syntax

In [None]:
re.search(r'cat','The cat is here')

<re.Match object; span=(4, 7), match='cat'>

In [58]:
re.search(r'cat','The dog is here')

In [59]:
re.search(r'cat|dog','The cat is here')  # The | operator matches either 'cat' or 'dog'

<re.Match object; span=(4, 7), match='cat'>

In [60]:
re.findall(r'at','The cat in the hat sat there.')

['at', 'at', 'at']

**Wildcard in Regular Expressions**

The wildcard character in regex is:

- **`.` (dot)**: Matches **any single character** except a newline (`\n`).

It is often used to represent unknown or variable characters in a pattern. To match a literal dot, use `\.` instead.

The wildcard is useful when the exact character is not known or can vary.


In [61]:
re.findall(r'.at','The cat in the hat sat there.')

['cat', 'hat', 'sat']

In [64]:
re.findall(r'...at','The cat in the hat went splat.')

['e cat', 'e hat', 'splat']

- **`^`**: Matches the start of a string.

In [65]:
re.findall(r'^\d','1 is a number')

['1']

- **`$`**: Matches the end of a string.

In [66]:
re.findall(r'\d$','The number is 2')

['2']

In [67]:
phrase = 'there are 3 numbers 34 inside 5 this sentence'

In [72]:
pattern = r'[^\d]+'

In [73]:
re.findall(pattern,phrase)

['there are ', ' numbers ', ' inside ', ' this sentence']

In [74]:
test_phrase = 'This is a string! But it has punctuation. How can we remove it?'

In [76]:
clean = re.findall(r'[^!.? ]+',test_phrase)

In [77]:
clean

['This',
 'is',
 'a',
 'string',
 'But',
 'it',
 'has',
 'punctuation',
 'How',
 'can',
 'we',
 'remove',
 'it']

In [78]:
' '.join(clean)

'This is a string But it has punctuation How can we remove it'

In [79]:
text = 'Only find the hyphen-words in this sentence. But you do not know how long-ish they are'

In [84]:
pattern = r'\w+-\w+'

In [85]:
re.findall(pattern,text)

['hyphen-words', 'long-ish']

In [86]:
text = 'Hello, would you like some catfish?'
texttwo = 'Hello, would you like to take a catnap?'
textthree = 'Hello, have you seen this caterpillar?'

In [87]:
re.search(r'cat(fish|nap|claw)',text)

<re.Match object; span=(27, 34), match='catfish'>

In [88]:
re.search(r'cat(fish|nap|claw)',texttwo)

<re.Match object; span=(32, 38), match='catnap'>

In [90]:
re.search(r'cat(fish|nap|claw)',textthree) 

## Timing your code

**Timing Your Code in Python**

Timing code helps measure how long a block of code takes to run. Python provides several ways to do this:

- **`time` module**: Use `time.time()` to read the current time in seconds since the epoch; the difference between two readings gives the elapsed time.
- **`timeit` module**: Provides a more precise way to measure small code snippets by running them multiple times.
- **`perf_counter()`**: From the `time` module, gives the highest available resolution timer.

These tools are useful for benchmarking and optimizing performance.
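
The cells below use `time.time()` and `timeit`. For completeness, here is a minimal sketch of the same measurement with `time.perf_counter()`; the input size is arbitrary and the printed duration will vary by machine.

```python
import time

def func_one(n):
    return [str(num) for num in range(n)]

# perf_counter is a clock intended for measuring short durations
start = time.perf_counter()
result = func_one(100_000)
elapsed = time.perf_counter() - start

print(f'{elapsed:.6f} seconds')
```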


In [1]:
def func_one(n):
    return [str(num) for num in range(n)]

In [2]:
func_one(10)

['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']

In [3]:
def func_two(n):
    return list(map(str,range(n)))

In [4]:
func_two(10)

['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']

In [5]:
import time

In [9]:
# Current time before
start_time = time.time()

# Run the code
result = func_one(1000000)

# Current time after running the code
end_time = time.time()

# Elapsed time
elapsed_time = end_time - start_time

print(elapsed_time)

0.30882692337036133


In [10]:
# Current time before
start_time = time.time()

# Run the code
result = func_two(1000000)

# Current time after running the code
end_time = time.time()

# Elapsed time
elapsed_time = end_time - start_time

print(elapsed_time)

0.25287318229675293


In [12]:
import timeit

In [13]:
stmt = '''func_one(100)'''

In [14]:
setup = '''
def func_one(n):
    return [str(num) for num in range(n)]
'''

In [17]:
timeit.timeit(stmt,setup,number=100000)

2.456511500000488

In [18]:
stmt = '''
func_two(100)
'''

In [19]:
setup = '''
def func_two(n):
    return list(map(str,range(n)))
'''

In [20]:
timeit.timeit(stmt,setup,number=100000)

1.9077610000094865

In [21]:
%%timeit
func_one(100)

30.5 μs ± 10.7 μs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)


In [22]:
%%timeit
func_two(100)

19.2 μs ± 7.87 μs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)


## Zipping and Unzipping files in Python

In [None]:
with open('fileone.txt', 'w') as f:
    f.write('One File')

In [None]:
with open('filetwo.txt', 'w') as f:
    f.write('Two File')

In [23]:
import zipfile

In [None]:
comp_file = zipfile.ZipFile('comp_file.zip','w')

In [None]:
comp_file.write('fileone.txt',compress_type=zipfile.ZIP_DEFLATED)

In [None]:
comp_file.write('filetwo.txt',compress_type=zipfile.ZIP_DEFLATED)

In [None]:
comp_file.close()

In [None]:
zip_obj = zipfile.ZipFile('comp_file.zip','r')

In [None]:
zip_obj.extractall("extracted_content")
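
The cells above open and close the archive by hand. As an alternative sketch, `zipfile.ZipFile` can be used as a context manager so the archive is closed automatically. The example works inside a temporary directory and reuses the file names from above; the `arcname` argument is added so the archive stores just the file name rather than the full temporary path.

```python
import os
import tempfile
import zipfile

with tempfile.TemporaryDirectory() as tmp:
    # Create a small file to compress
    source = os.path.join(tmp, 'fileone.txt')
    with open(source, 'w') as f:
        f.write('One File')

    archive = os.path.join(tmp, 'comp_file.zip')

    # Write the archive; it is closed when the with block ends
    with zipfile.ZipFile(archive, 'w', compression=zipfile.ZIP_DEFLATED) as zf:
        zf.write(source, arcname='fileone.txt')

    # Read it back: list the contents and extract them
    target = os.path.join(tmp, 'extracted_content')
    with zipfile.ZipFile(archive, 'r') as zf:
        names = zf.namelist()
        zf.extractall(target)

    with open(os.path.join(target, 'fileone.txt')) as f:
        content = f.read()

print(names, content)
```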

In [24]:
import shutil

In [None]:
dir_to_zip = 'Demo Path'

In [None]:
output_filename = 'example'

In [None]:
shutil.make_archive(output_filename,'zip',dir_to_zip)

In [None]:
shutil.unpack_archive('example.zip','final_unzip','zip')