Python

Table of Contents (Click me)

General Notes
Python Environment Notes
Specific version features
Python Language Notes
Add to google colab
requests, urllib(3), & http modules
Concurrent, asynchronous, multiprocessing
Performance / memory considerations
itertools
Misc.
Do specific things

Colab Notebooks

Links

General resource websites:

Look into

funcy: https://funcy.readthedocs.io/en/stable/overview.html
async python and databases article
(var,)
pydantic
dataclasses: link
@property: Programiz link
- Tech with Tim video
Python classes / OOP: Programiz link
- getters and setters
- @abstractmethod
contextlib.suppress()
collections.defaultdict
Partial functions: tech w/ tim video
__init__.py: tech w/ tim video
FireDucks (pandas replacement?)

General Notes

For FIPS compliance, use hashlib.sha256() instead of hashlib.md5() (MD5 is not a FIPS-approved algorithm)
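A minimal sketch of the swap, assuming the hash is only used to fingerprint some bytes. The `payload` value is a hypothetical placeholder.

```python
import hashlib

payload = b"example payload"  # hypothetical data to fingerprint

# SHA-256 is a FIPS-approved hash function; MD5 is not
digest = hashlib.sha256(payload).hexdigest()
print(digest)  # 64 hexadecimal characters
```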

Python Environment Notes

pip

Install pip: link
PyPI is the central repository of software that pip uses to download from
pip installs packages, which are usually bundled into 'wheels' (.whl) prior to installation
A wheel is a zip-style archive that contains all the files necessary for a typical package installation
It may be best to run it as python3 -m pip instead of using the system-installed pip
- Why you should use python -m pip
- This executes pip using the Python interpreter you specified as python
- This is beneficial because you know which Python installation pip is installing into if you have multiple python versions installed

Tips

See package dependencies: pipdeptree
- Install it first with pip install pipdeptree
Loop through n first elements in list link: for item in itertools.islice(my_list, n)

Wheel files (.whl)

What are python wheels

Virtual environments (venv)

Complete guide to python virtual environments
Virtual environments are used when you want to be more explicit and limit packages to a specific project
You should never install stuff into your global Python interpreter when you develop locally
To create an environment: python -m venv <venv name>

Python Language Notes

Variables and parameter passing

Deep dive into variables in Python
Python uses pass-by-object-reference for function parameter passing
- Any in-place change to the object held in the parameter (e.g. list.append()) is visible through the original variable
- Any reassignment of the parameter inside the function will not be reflected in the original variable
When passing a variable that refers to a mutable object (list, dict, set, class instances, etc.), make sure that you pass a copy if you don't want the original object to be modified by any changes in the function
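The difference between mutating and reassigning a parameter can be sketched as follows. The function and variable names are made up for illustration.

```python
import copy

def mutate(items):
    items.append(99)   # in-place change: the caller sees it

def reassign(items):
    items = [0]        # rebinds the local name only: the caller is unaffected

original = [1, 2, 3]
mutate(original)
print(original)        # [1, 2, 3, 99]

reassign(original)
print(original)        # [1, 2, 3, 99]

protected = [1, 2, 3]
mutate(protected.copy())   # pass a shallow copy to protect the original
print(protected)           # [1, 2, 3]

nested = {"scores": [1, 2]}
safe = copy.deepcopy(nested)   # nested mutable objects need a deep copy
safe["scores"].append(3)
print(nested)              # {'scores': [1, 2]}
```

A shallow copy is enough for a flat list. Use copy.deepcopy() when the object contains other mutable objects.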

Duck typing

Duck typing is a programming concept where the suitability of an object is determined by the presence of certain methods and properties, rather than the object's actual type
Instead of checking an object's type explicitly, code assumes that if it "quacks like a duck" (i.e., has the necessary behavior), it can be used in the desired context
Key points:
- Behavior over type: The focus is on what an object can do, not what class it belongs to.
- Flexibility: Allows functions to operate on any object that implements the expected interface, promoting reusable and adaptable code.
- Python's dynamic nature: Python commonly uses duck typing, enabling developers to write generic functions that work with various objects as long as they provide the required attributes or methods

Ex:

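def quack_and_walk(duck):
    duck.quack()
    duck.walk()

class Duck:
    def quack(self):
        print('Quack!')

    def walk(self):
        print('Waddle waddle.')

class Person:
    def quack(self):
        print('I can quack like a duck!')

    def walk(self):
        print('I can walk like a duck!')

# Both objects work with quack_and_walk() because they implement quack() and walk()
quack_and_walk(Duck())
quack_and_walk(Person())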

Dunder Methods (aka Magic Methods)

Dunder methods are methods that allow instances of a class to interact with the built-in functions and operators of the language
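As an illustration, here is a small hypothetical `Vector` class (not from any library) that hooks into `+`, `==`, `len()`, and `print()` through dunder methods:

```python
class Vector:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        # Used by repr(), and by print() when __str__ is not defined
        return f"Vector({self.x}, {self.y})"

    def __add__(self, other):
        # Called for: self + other
        return Vector(self.x + other.x, self.y + other.y)

    def __eq__(self, other):
        # Called for: self == other
        return isinstance(other, Vector) and (self.x, self.y) == (other.x, other.y)

    def __len__(self):
        # Called for: len(self)
        return 2

v = Vector(1, 2) + Vector(3, 4)
print(v)                  # Vector(4, 6)
print(len(v))             # 2
print(v == Vector(4, 6))  # True
```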

`**` (un)packing

Use of double asterisk (**) in python
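A short sketch of the common uses of `**`: collecting extra keyword arguments, unpacking a dict into keyword arguments, and merging dicts. The `connect` function and its parameters are hypothetical.

```python
def connect(host, port=5432, **options):
    # **options collects any extra keyword arguments into a dict
    return {"host": host, "port": port, **options}

settings = {"host": "localhost", "port": 6543, "timeout": 10}
result = connect(**settings)   # unpack the dict into keyword arguments
print(result)   # {'host': 'localhost', 'port': 6543, 'timeout': 10}

defaults = {"port": 5432, "timeout": 30}
overrides = {"timeout": 5}
merged = {**defaults, **overrides}   # later keys win
print(merged)   # {'port': 5432, 'timeout': 5}
```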

Exceptions; try / except

Raising / handling custom exception:

class ExampleException(Exception):
    pass
...
try:
    if not var:
        raise ExampleException
    else:
        pass  # do work
except ExampleException as err:
    pass  # handle exception
finally:
    # always runs regardless (even if the other blocks have a return statement)
    print('inside')

The finally block always runs regardless, even if the other blocks have a return statement
- If the finally block has a return also, it overwrites the return from the other block
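Both behaviours can be seen in a small sketch. The function names are made up for illustration.

```python
def finally_overrides():
    try:
        return "from try"
    finally:
        return "from finally"   # replaces the value returned by try

def finally_runs_first(log):
    try:
        return "from try"
    finally:
        log.append("cleanup")   # runs before the function actually returns

print(finally_overrides())      # from finally

log = []
print(finally_runs_first(log))  # from try
print(log)                      # ['cleanup']
```

Because a return inside finally silently discards the other return value (and any in-flight exception), it is usually best avoided.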

Emptiness / None check

if not nums is preferred and quicker than if len(nums) == 0
To explicitly check for None, use if nums is None (an empty list and None are both falsy, so if not nums cannot tell them apart)
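A minimal sketch of how the two checks differ, using a hypothetical `describe` helper:

```python
def describe(nums):
    if nums is None:       # identity check: only matches None
        return "missing"
    if not nums:           # truthiness check: matches any empty container
        return "empty"
    return f"{len(nums)} item(s)"

print(describe(None))   # missing
print(describe([]))     # empty
print(describe([0]))    # 1 item(s)
```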

Add to google colab:

String methods: .startswith() and .endswith() instead of string slicing
- List of string methods
Create dictionary from two different sequences / collections: dict(zip(list1, list2))
Update multiple key/value pairs in a dictionary: d1.update(key1=val1, key2=val2) or d1.update({'key1': 'val1', 'key2': 'val2'})
- Ex: class_grades.update(algebra='A', english='B')
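The three tips above, put together in one small sketch (the data is made up):

```python
subjects = ["algebra", "english", "history"]
grades = ["A", "B", "C"]

# Pair the two sequences up, element by element
class_grades = dict(zip(subjects, grades))

# Keyword form: keys must be valid identifiers
class_grades.update(algebra="A+", science="B")
# Mapping form: works for any hashable key
class_grades.update({"history": "B-"})
print(class_grades)
# {'algebra': 'A+', 'english': 'B', 'history': 'B-', 'science': 'B'}

filename = "report_2024.csv"
print(filename.startswith("report_"))        # True
print(filename.endswith((".csv", ".tsv")))   # True (a tuple checks several options)
```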

Dictionaries:

Generators:

Generators (Real Python)

Classes / OOP

To access the base class methods and attributes, you can use super()

Ex: calls both display_info() functions

class Polygon:
    def __init__(self, sides):
        self.sides = sides

    def display_info(self):
        print("A polygon is a two dimensional shape with straight lines.")

class Triangle(Polygon):
    def display_info(self):
        print("A triangle is a polygon with 3 edges.")
        super().display_info() # call the display_info() method of Polygon

If you have an __init__() constructor in the child class, you need to call super().__init__() inside it (passing along the parent's arguments) so that it initializes the attributes from the parent class

Ex: Student.__init__() calls Person.__init__()

class Person:
    def __init__(self, name):
        self.name = name

class Student(Person):
    def __init__(self, name, student_id):
        super().__init__(name)  # initialize the Person attributes
        self.student_id = student_id

The @staticmethod decorator defines a method that lives on the class but receives neither the instance (self) nor the class (cls) as an implicit first argument
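A short sketch with a hypothetical `Temperature` class. The static method can be called on the class itself or through an instance.

```python
class Temperature:
    def __init__(self, celsius):
        self.celsius = celsius

    @staticmethod
    def c_to_f(celsius):
        # No self or cls: behaves like a plain function namespaced in the class
        return celsius * 9 / 5 + 32

    def as_fahrenheit(self):
        return self.c_to_f(self.celsius)

print(Temperature.c_to_f(100))          # 212.0
print(Temperature(0).as_fahrenheit())   # 32.0
```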

Python Code Organization: Classes, Files, Modules, Packages, Libraries, Frameworks, and Imports

modules ⊆ packages ⊆ libraries ⊆ frameworks

Python Files (`.py`)

A Python file (.py) contains Python code (functions, variables, classes, etc.)
It can be executed directly or imported as a module

Modules (single file)

A module is simply a Python file (.py) that can be imported into another Python file
Modules help organize and reuse code
Standard library modules (like math, os) ship with Python, so they need no installation

Packages (directory with modules)

A package is a directory containing multiple modules and an __init__.py file (optional in Python 3.3+)
Helps organize large projects

Example Package Structure:

  my_project/
  │── main.py
  │── my_package/
  │   │── __init__.py
  │   │── module1.py
  │   │── module2.py

Libraries (Collection of Modules & Packages)

A collection of modules or packages that provide reusable functionality
Examples: requests, numpy, flask

Frameworks (Structured Library for a Purpose)

A structured collection of libraries and conventions that help build applications
Examples: Django (web development), Flask (web framework), PyTorch (machine learning)

requests, urllib(3), & http modules

Newer / higher-level: `urllib3` vs `requests`

urllib3 and requests

| Feature | `urllib3` | `requests` |
| --- | --- | --- |
| Level of Control | High (Low-level, customizable) | Medium (High-level, abstracted API) |
| Ease of Use | Easier than `urllib`, more setup than `requests` | Very easy, user-friendly |
| Connection Pooling | Automatic, efficient pooling | Automatic, abstracted |
| Performance | Lightweight, efficient for many requests | Slightly heavier due to extra features |
| Retries and Timeouts | Customizable, easy to configure | Built-in, simplified |
| Code Readability | More readable than `urllib`, but still verbose | Highly readable, concise |
| Community/Documentation | Moderate community, adequate docs | Large community, extensive documentation |
| Dependencies | Few dependencies | More dependencies (heavier library) |
| Handling JSON and Sessions | Basic handling | Built-in, user-friendly |

`requests`

requests Session to keep state
For most use cases, requests is the best choice. It is:
- User-friendly: Easy to use with a simple API
- Readable: Produces clean, concise, and maintainable code
- Feature-rich: Includes built-in support for sessions, cookies, retries, and JSON handling
- Community: Has extensive documentation and a large user base

`urllib3`

placeholder

Older / lower-level: `urllib` vs `http(.client)`

Useful when you need to avoid third-party dependencies or require very low-level control

Comparison table(Click me)

| Feature | `urllib` | `http.client` |
| --- | --- | --- |
| Level of Control | High (Low-level API, manual setup) | Very High (Raw HTTP control) |
| Ease of Use | Complex, verbose | More complex, highly verbose |
| Connection Pooling | No built-in pooling | Manual connection handling |
| Performance | Lightweight but verbose | Lightweight, but requires manual setup |
| Retries and Timeouts | Manual setup | Manual setup |
| Code Readability | Verbose, not user-friendly | Highly verbose, least readable |
| Community/Documentation | Limited (older, standard lib) | Limited (low-level library) |
| Dependencies | No external dependencies (built-in) | No external dependencies (built-in) |
| Handling JSON and Sessions | Manual handling | Manual handling |
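To give a feel for the "manual handling" in the standard library, here is a sketch that builds a JSON POST request with `urllib.request`. The URL and payload are hypothetical, and the request is only constructed, not sent; the `send` helper shows how it might be sent.

```python
import json
import urllib.request

url = "https://api.example.com/items"   # hypothetical endpoint
payload = json.dumps({"name": "widget"}).encode("utf-8")   # body must be bytes

request = urllib.request.Request(
    url,
    data=payload,
    headers={"Content-Type": "application/json"},
    method="POST",
)
print(request.get_method(), request.full_url)

def send(req, timeout=10):
    # Sending and decoding are manual too; always pass a timeout
    with urllib.request.urlopen(req, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```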

`http`

http module usage

Concurrent, asynchronous, multiprocessing

concurrent.futures makes it easy to run selected parts of a mostly synchronous program concurrently, using a thread pool (ThreadPoolExecutor) or a process pool (ProcessPoolExecutor)
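A minimal sketch of running one slow step through a thread pool while the rest of the program stays synchronous. `slow_square` is a hypothetical stand-in for blocking I/O such as a network call.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def slow_square(n):
    time.sleep(0.1)   # stand-in for blocking I/O
    return n * n

# The with block waits for all submitted work before exiting
with ThreadPoolExecutor(max_workers=4) as pool:
    # map() returns results in input order, even if tasks finish out of order
    results = list(pool.map(slow_square, range(8)))

print(results)   # [0, 1, 4, 9, 16, 25, 36, 49]
```

Threads suit I/O-bound work. For CPU-bound work, ProcessPoolExecutor exposes the same interface.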

Performance / memory optimization considerations

Sets (O(1)) have faster lookup times than lists (O(n))
List comprehensions / generator expressions are typically faster than filter() / map() / reduce() combinations
Generator expressions are preferred over list comprehensions when the result only needs to be iterated over once
- Generator expressions produce values on-the-fly, so they avoid creating an intermediate list and use far less memory
- However, they are not necessarily faster: list comprehensions are often quicker for small datasets due to the overhead of driving the iterator
if not my_list is faster than if len(my_list) == 0 (roughly 2x in informal timings; measure with timeit for your own case)
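The memory difference can be checked directly with `sys.getsizeof`. The sketch below deliberately avoids timing claims, since those vary by machine and Python version.

```python
import sys

n = 100_000
squares_list = [i * i for i in range(n)]   # builds all n values up front
squares_gen = (i * i for i in range(n))    # builds values one at a time

# The list holds n references; the generator object stays small
print(sys.getsizeof(squares_list) > sys.getsizeof(squares_gen))   # True

# A generator expression can feed an aggregate without an intermediate list
total = sum(i * i for i in range(n))

# Membership tests: a set hashes the value, a list scans element by element
allowed = set(range(n))
print((n - 1) in allowed)   # True
```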

`itertools`

itertools tutorial / documentation

Note: the operator module is used in some examples, but it is not necessary when using itertools

accumulate(): makes an iterator that returns accumulated results (running totals by default, or the running result of a two-argument function)

itertools.accumulate(iterable[, func])

Passing a function

import itertools
import operator

data = [1, 2, 3, 4, 5]
result = itertools.accumulate(data, operator.mul)   # running product
print(list(result))   # [1, 2, 6, 24, 120]

Without passing a function (defaults to summation)

result = itertools.accumulate(data)
print(list(result))   # [1, 3, 6, 10, 15]

combinations(): takes an iterable and an integer r, and creates all the unique combinations that have r members (order does not matter)
- itertools.combinations(iterable, r)
```
shapes = ['circle', 'triangle', 'square',]
result = itertools.combinations(shapes, 2)
print(list(result))   # [('circle', 'triangle'), ('circle', 'square'), ('triangle', 'square')]
```
count(): makes an iterator that returns evenly spaced values beginning at start
- Similar to range(), but has no stop value, so it produces an infinite sequence (range() is already lazy, so memory use is comparable)
- itertools.count(start=0, step=1)
```
for i in itertools.count(10,3):
    print(i)
    if i > 20:
        break
# 10, 13, 16, 19, 22  (as individual lines)
```

cycle(): cycles through an iterable endlessly

itertools.cycle(iterable)

colors = ['red', 'orange', 'yellow', 'green']
for color in itertools.cycle(colors):
    print(color)
# red, orange, yellow, green, red, orange, ...  (as individual lines; never stops unless you break out of the loop)

chain(): treats several iterables as one continuous sequence

itertools.chain(*iterables)

colors = ['red', 'orange']
shapes = ['circle', 'square']
for item in itertools.chain(colors, shapes):
    print(item)
# red, orange, circle, square  (as individual lines)

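islice(): makes an iterator that returns selected elements from an iterable
- Similar to index slicing ([:x]), but is more memory-efficient and can handle infinite and non-indexable iterables
- itertools.islice(iterable, stop) or itertools.islice(iterable, start, stop[, step])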
```
colors = ['red', 'orange', 'yellow', 'green']
for color in itertools.islice(colors, 2):
    print(color)
# red, orange (as individual lines)
```
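A short sketch of the cases that plain slicing cannot handle, an infinite iterator and a generator, plus the start/stop/step form:

```python
import itertools

# Infinite iterator: evens[:5] would fail, islice works
evens = itertools.count(0, 2)
first_five = list(itertools.islice(evens, 5))
print(first_five)   # [0, 2, 4, 6, 8]

# start, stop, step form
window = list(itertools.islice(range(100), 10, 20, 5))
print(window)       # [10, 15]

# Generators are not indexable, but can still be sliced lazily
squares = (n * n for n in range(10))
first_squares = list(itertools.islice(squares, 3))
print(first_squares)   # [0, 1, 4]
```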

permutations(): creates all possible orderings of r members (r defaults to the length of the iterable)

itertools.permutations(iterable, r=None)

alpha_data = ['a', 'b', 'c']
result = itertools.permutations(alpha_data)
list(result)  # [('a', 'b', 'c'), ('a', 'c', 'b'), ('b', 'a', 'c'), ('b', 'c', 'a'), ('c', 'a', 'b'), ('c', 'b', 'a')]

product(): creates the Cartesian product of a series of iterables.

itertools.product(*iterables, repeat=1)

num_data = [1, 2, 3]
alpha_data = ['a', 'b', 'c']
result = itertools.product(num_data, alpha_data)
list(result)  # [(1, 'a'), (1, 'b'), (1, 'c'), (2, 'a'), (2, 'b'), (2, 'c'), (3, 'a'), (3, 'b'), (3, 'c')]

Misc.

dis.dis(): see the bytecode instructions that a python code snippet compiles to
timeit.timeit(): see how long a specific code snippet takes to run
mypy package: does static type checking
pprint: https://docs.python.org/3/library/pprint.html

Version features

3.8

Assignment Expressions (walrus operator) (link): can use := in an expression in a while loop or if statement to assign and evaluate it
- Ex: if (y := 2) > 1: # sets y = 2 and evaluates the expression as 2 > 1
- Ex: while (user_input := input("Enter text: ")) != "stop": # keeps getting user input until "stop" is entered
- Can also use it in list comprehensions: [result for i in range(5) if (result := func(i)) == True]
  - It is more efficient because it potentially only makes half the func() calls compared to [func(i) for i in range(5) if func(i) == True]
f-string improvements: Now supports the = specifier for debugging (f"{var=}")
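A small sketch of both 3.8 features. `expensive` is a hypothetical function that records its calls, so the number of calls in the comprehension can be checked.

```python
call_log = []

def expensive(i):
    call_log.append(i)   # record each call so we can count them
    return i * 10

# expensive() runs once per item; the walrus reuses the result in the output
values = [result for i in range(5) if (result := expensive(i)) > 10]
print(values)          # [20, 30, 40]
print(len(call_log))   # 5

# The = specifier prints the expression text together with its value
x = 3
debug_text = f"{x=}"
print(debug_text)      # x=3
```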

3.9

Dictionary union operators (| and |=):
- d1 | d2 results in new dictionary resulting from the union of d1 and d2
- Can use |= to do an in-place (update) union: d1 |= d2  # makes d1 equal to the resulting union
.removeprefix() and .removesuffix(): methods to simplify removing prefixes and suffixes from strings
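A quick sketch of the 3.9 additions, using made-up data:

```python
defaults = {"theme": "light", "lang": "en"}
user = {"theme": "dark"}

merged = defaults | user    # new dict; right-hand values win on conflicts
print(merged)               # {'theme': 'dark', 'lang': 'en'}

defaults |= {"lang": "fr"}  # in-place update
print(defaults)             # {'theme': 'light', 'lang': 'fr'}

print("test_login.py".removeprefix("test_"))   # login.py
print("test_login.py".removesuffix(".py"))     # test_login
print("login.py".removeprefix("test_"))        # login.py (unchanged if no match)
```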

3.10

Pattern matching (match and case) statements:
- Introduces a match statement similar to switch-case, allowing pattern matching
- Example
```
match command:
    case 'start':
        start_process()
    case 'stop':
        stop_process()
```
Parenthesized Context Managers:
- Allows using multiple context managers more neatly.
- Example: with (open('file1') as f1, open('file2') as f2):

3.11

Significant Performance Improvements:
- Python 3.11 includes performance improvements, claiming to be around 10-60% faster than Python 3.10
Exception Groups (ExceptionGroup) and except*:
- Allows raising and handling multiple exceptions simultaneously.
- except* is used to handle ExceptionGroup objects
  - This allows you to handle multiple exceptions raised together, enabling you to catch subsets of exceptions more precisely
- Example:
```
try:
  raise ExceptionGroup("Multiple Errors", [ValueError("Invalid value"), TypeError("Type mismatch")])
except* ValueError as e:
  print(f"Caught ValueError: {e}")
except* TypeError as e:
  print(f"Caught TypeError: {e}")
```

Task groups in asyncio (asyncio.TaskGroup):

Easier way to manage groups of asynchronous tasks.

Example:

# must run inside a coroutine (async def); leaving the block waits for all tasks
async with asyncio.TaskGroup() as tg:
  tg.create_task(some_coroutine())

3.12

Enhanced async and await:
- Improvements in asyncio and asynchronous task handling for better performance and simpler code patterns

Do specific things:

Debug print:

DEBUG = True
def print_debug(*args, **kwargs):
    if DEBUG:
        print(' '.join(map(str, args)), **kwargs, flush=True)

Print traceback of error after catching an Exception: traceback.format_exc()
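A minimal sketch: `traceback.format_exc()` returns the traceback of the exception currently being handled as a string, so it can be printed or logged. `risky` is a made-up function.

```python
import traceback

def risky():
    return 1 / 0

try:
    risky()
except Exception:
    # Must be called while the exception is being handled
    tb_text = traceback.format_exc()
    print(tb_text)
```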

Check if variable or attribute exists, without causing an error if it doesn't:

if 'myVar' in locals():
    pass  # myVar exists in local scope
if 'myVar' in globals():
    pass  # myVar exists in global scope
if hasattr(obj, 'attr_name'):
    pass  # obj.attr_name exists

Parse a JSON string, only if it exists (isn't empty)
- data = json.loads(body) if (body := event.get("body")) else body
Clean use of ternary for dictionary value:
- email = os.environ.get('stg_email' if 'stg' in env else 'prod_email')

Concise and efficient way to get a value based on a specific input (similar to case statement):

type_ = query.get('type')
type_num = {'a': 1, 'b': 2, 'c': 3}.get(type_, 0)   # Provide a default value (e.g., 0)

Build dictionary from list of keys:

def get_data_values(data):   # Dict[str, Any] -> Dict[str, Any]
    keys = ['a', 'b', 'c', 'd']
    return {
        'type': 'data',
        'id': data.get('data_id', ''),
        **{key: data.get(key, '') for key in keys}
    }

See list of installed and built-in modules: help('modules') (help() prints its output itself and returns None, so wrapping it in print() also prints None)
Check if object is an instance of a specific class(es)
- isinstance(var, str): returns True if var is a string object
- isinstance(var, (str, int)): returns True if var is a string or an int object
