# 🐍 Advanced Python Survival Guide
**Audience:** Experienced Python developers  
**Duration:** ~2 hours (guided)  
**Last generated:** 2025-09-11 13:55

This notebook is a *textbook-like survival document*: read the markdown, run the code, and explore the nuances.


## Agenda
1. Subtleties of Core Data Structures (lists, sets, tuples, dictionaries)
2. Python Internals & “Oaths” (Zen, identity vs equality, mutability traps, GC)
3. Classic & Advanced Data Structures (BFS, DFS, Heap / Dijkstra)
4. Advanced Usage Patterns (itertools, generators/coroutines, context managers, memoization, typing)
5. Wrap-up + Further reading


## 1. Subtleties of Core Data Structures

### 1.1 Lists – dynamic arrays, memory behavior, and copying
Python lists are dynamic arrays. Appending is **amortized O(1)** because CPython overallocates spare capacity whenever it resizes, so most appends do not trigger a reallocation.
We’ll also inspect **shallow vs deep copy**, and list comprehensions vs generator expressions.


In [None]:
import sys

nums = []
sizes = []
for i in range(16):
    nums.append(i)
    sizes.append(sys.getsizeof(nums))

list(zip(range(16), sizes))[:10]  # show first 10 (index, size in bytes)
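
To make the overallocation easier to see, the sketch below records only the list lengths at which the reported size changes. It assumes CPython, where `sys.getsizeof` reflects the allocated capacity; `resize_points` is a hypothetical helper name, and the exact growth pattern varies between versions.

```python
import sys

def resize_points(n):
    """Return the list lengths at which sys.getsizeof changed while appending n items."""
    items = []
    last_size = sys.getsizeof(items)
    points = []
    for i in range(n):
        items.append(i)
        size = sys.getsizeof(items)
        if size != last_size:  # capacity grew, so a reallocation happened
            points.append(len(items))
            last_size = size
    return points

points = resize_points(1000)
print("Lengths at which the list was reallocated:", points)
print("Reallocations for 1000 appends:", len(points))
```

The number of reallocations is far smaller than the number of appends, which is what makes the average cost per append constant.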


In [None]:
import copy

a = [[1,2], [3,4]]
b = a[:]                 # shallow copy
c = copy.deepcopy(a)     # deep copy

a[0][0] = 99
print("a:", a)           # a changed
print("b (shallow):", b) # b reflects inner mutation
print("c (deep):", c)    # c unaffected
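
A closely related aliasing trap is list multiplication, which repeats *references* to the same inner list rather than copying it. The sketch below contrasts it with a comprehension; the variable names are illustrative only.

```python
# Pitfall: both rows are the very same inner list object
grid_aliased = [[0] * 3] * 2
grid_aliased[0][0] = 1

# A comprehension builds a fresh inner list for every row
grid_independent = [[0] * 3 for _ in range(2)]
grid_independent[0][0] = 1

print("aliased    :", grid_aliased)      # [[1, 0, 0], [1, 0, 0]]
print("independent:", grid_independent)  # [[1, 0, 0], [0, 0, 0]]
print("same row object?", grid_aliased[0] is grid_aliased[1])  # True
```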


In [None]:
# List comprehension vs generator expression (memory implications)
import sys

squares_list = [x*x for x in range(10_000)]
squares_gen = (x*x for x in range(10_000))

# getsizeof reports only the container itself, not the int objects it references
print("List size (bytes):", sys.getsizeof(squares_list))
print("Generator size (bytes):", sys.getsizeof(squares_gen))

# Consume a little of the generator to show it works lazily
sum(next(squares_gen) for _ in range(5))  # sum of first 5 squares


### 1.2 Sets – hashing, unhashable elements, and deduplication
Sets are hash tables with average O(1) membership tests. Elements must be **hashable**.


In [None]:
s = set()
try:
    s.add([1,2])   # lists are unhashable (mutable)
except TypeError as e:
    print("Expected TypeError for list:", e)

s.add((1,2))       # tuples are hashable (when they contain only hashable items)
print("Set contents:", s)

emails = ["a@x.com", "b@y.com", "a@x.com"]
unique = set(emails)
print("Unique emails:", unique)
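
Two follow-ups to the cell above, as a small sketch: `set()` does not preserve the original order, so `dict.fromkeys` is a common way to deduplicate while keeping first-seen order, and `frozenset` is the hashable variant to reach for when a set has to be stored inside another set. The sample data is made up for illustration.

```python
raw_emails = ["b@y.com", "a@x.com", "b@y.com", "c@z.com"]

# dict keys are unique and keep insertion order, so this deduplicates in order
unique_in_order = list(dict.fromkeys(raw_emails))
print("Unique, first-seen order:", unique_in_order)  # ['b@y.com', 'a@x.com', 'c@z.com']

# A plain set is unhashable, but a frozenset can be a set element or dict key
groups = {frozenset({"a", "b"}), frozenset({"b", "a"})}
print("Distinct groups:", len(groups))  # 1, because both frozensets are equal
```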


### 1.3 Tuples – immutability and containing mutables
Tuples are immutable, but they can **contain** mutable objects (which can be mutated).
Also compare `namedtuple` vs `dataclass` for lightweight records.


In [None]:
from collections import namedtuple
from dataclasses import dataclass

t = (1, [2,3])
t[1].append(4)
print("Tuple containing a list (mutated list):", t)

Point = namedtuple('Point', 'x y')
p1 = Point(10, 20)

@dataclass
class PointDC:
    x: int
    y: int
p2 = PointDC(10, 20)

print("namedtuple:", p1, p1.x, p1.y)
print("dataclass :", p2, p2.x, p2.y)
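
The sketch below highlights a few practical differences between the two record types, plus the hashing consequence of putting a mutable object inside a tuple. `PointNT` and `FrozenPoint` are hypothetical names chosen for this example.

```python
from collections import namedtuple
from dataclasses import dataclass, FrozenInstanceError

PointNT = namedtuple("PointNT", "x y")

@dataclass(frozen=True)  # frozen=True makes instances immutable and hashable
class FrozenPoint:
    x: int
    y: int

p = PointNT(10, 20)
x, y = p                                   # a namedtuple unpacks like a plain tuple
print("Equal to a plain tuple:", p == (10, 20))  # True

fp = FrozenPoint(10, 20)
try:
    fp.x = 99                              # assignment is rejected on a frozen dataclass
except FrozenInstanceError as e:
    print("Frozen dataclass rejected assignment:", e)
print("Equal hashes:", hash(fp) == hash(FrozenPoint(10, 20)))  # True

try:
    hash((1, [2, 3]))                      # the inner list makes the tuple unhashable
except TypeError as e:
    print("Tuple with a list is unhashable:", e)
```

A regular (non-frozen) dataclass stays mutable and is not hashable by default, which is often the deciding factor between the two.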


### 1.4 Dictionaries – ordered insertion, dynamic views, and collections helpers
Dictionaries preserve insertion order (guaranteed by the language since Python 3.7). The views returned by `keys()`, `values()`, and `items()` are **dynamic** and reflect later updates.
`collections` offers `defaultdict`, `Counter`, and `ChainMap` for powerful patterns.


In [None]:
from collections import defaultdict, Counter, ChainMap

d = {"a": 1, "b": 2}
keys_view = d.keys()
d["c"] = 3
print("Keys view (dynamic):", list(keys_view))  # includes 'c'

dd = defaultdict(int)
dd["x"] += 1
print("defaultdict with int factory:", dict(dd))

c = Counter("abracadabra")
print("Counter most common 2:", c.most_common(2))

# ChainMap example: overlay config layers
defaults = {"timeout": 10, "retries": 2}
env = {"timeout": 20}
cmd = {"retries": 5}
cm = ChainMap(cmd, env, defaults)
print("ChainMap merged config:", dict(cm))
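
Two behaviors worth knowing beyond the cell above, shown as a minimal sketch with made-up config values: key views support set operations, which makes diffing two dictionaries concise, and writes to a `ChainMap` only ever touch its first mapping.

```python
from collections import ChainMap

old_cfg = {"timeout": 10, "retries": 2, "debug": False}
new_cfg = {"timeout": 20, "retries": 2, "verbose": True}

# keys() views behave like sets
added = new_cfg.keys() - old_cfg.keys()
removed = old_cfg.keys() - new_cfg.keys()
changed = {k for k in old_cfg.keys() & new_cfg.keys() if old_cfg[k] != new_cfg[k]}
print("added:", added, "removed:", removed, "changed:", changed)

# ChainMap reads search every layer, but writes go to the first mapping only
base = {"timeout": 10}
overrides = {}
layered = ChainMap(overrides, base)
layered["timeout"] = 30
print("overrides:", overrides)  # {'timeout': 30}
print("base     :", base)       # {'timeout': 10}
```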


## 2. Python Internals & “Oaths”

- **Zen of Python**: guiding principles for writing idiomatic, readable code.
- **Identity vs Equality**: `is` compares object identity; `==` compares values.
- **Default mutable argument trap** and its canonical fix.
- **Reference counting & GC basics**.


In [None]:
import this  # prints Zen of Python

In [None]:
a = 1000
b = 1000
print("a is b:", a is b)   # implementation-dependent (CPython may or may not reuse one int object); never rely on `is` for values
print("a == b:", a == b)   # True
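
Where identity checks are the right tool: aliases, `None`, and sentinel objects. The sketch below uses a hypothetical `_MISSING` sentinel and `get_setting` helper to tell "no default supplied" apart from an explicit `None`, which `==` could not do reliably.

```python
first = [1, 2, 3]
second = [1, 2, 3]
alias = first

print("first == second:", first == second)  # True, same contents
print("first is second:", first is second)  # False, two distinct list objects
print("first is alias :", first is alias)   # True, two names for one object

_MISSING = object()  # unique sentinel; only `is` can identify it

def get_setting(settings, key, default=_MISSING):
    """Look up key; raise KeyError only when no default was supplied."""
    value = settings.get(key, _MISSING)
    if value is not _MISSING:
        return value
    if default is _MISSING:
        raise KeyError(key)
    return default

print(get_setting({"proxy": None}, "proxy"))   # None (a stored value)
print(get_setting({}, "proxy", default=None))  # None (the explicit default)
```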


In [None]:
def add_item(item, bucket=[]):  # BAD: default is shared across calls
    bucket.append(item)
    return bucket

print(add_item(1))  # [1]
print(add_item(2))  # [1, 2]  # surprising!


In [None]:
def add_item(item, bucket=None):  # GOOD: create new list when None
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket

print(add_item(1))
print(add_item(2))


In [None]:
import sys, gc

x = []
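# The reported count is generally one higher than expected: it includes the temporary reference held by the call itself
print("Reference count for x (approx):", sys.getrefcount(x))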
print("GC thresholds:", gc.get_threshold())
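
Reference counting alone cannot reclaim objects that refer to each other, which is why the cyclic garbage collector exists. The sketch below assumes CPython; `Node` is a hypothetical class used only to build a cycle, and a weak reference serves as a probe that does not keep the object alive.

```python
import gc
import weakref

class Node:
    """Hypothetical class used only to build a reference cycle."""
    def __init__(self, name):
        self.name = name
        self.partner = None

left, right = Node("left"), Node("right")
left.partner, right.partner = right, left  # each object now references the other

probe = weakref.ref(left)  # weak reference: observes the object without owning it
del left, right            # the names are gone, but the cycle keeps both refcounts above zero

gc.collect()               # the cycle collector finds and frees the unreachable pair
print("Collected after gc.collect():", probe() is None)  # True
```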


## 3. Classic & Advanced Data Structures

We’ll implement **BFS** (queue), **DFS** (recursion/stack), and an **advanced priority-queue** example using `heapq` for **Dijkstra**.


In [None]:
from collections import deque

def bfs(graph, start):
    visited = {start}
    q = deque([start])
    order = []
    while q:
        node = q.popleft()
        order.append(node)
        for nei in graph.get(node, []):
            if nei not in visited:
                visited.add(nei)
                q.append(nei)
    return order

# Neighbors are lists (not sets) so the traversal order is reproducible across runs
graph_unweighted = {
    'A': ['B', 'C'],
    'B': ['A', 'D', 'E'],
    'C': ['A', 'F'],
    'D': ['B'],
    'E': ['B', 'F'],
    'F': ['C', 'E']
}

print("BFS order from A:", bfs(graph_unweighted, 'A'))


In [None]:
def dfs_recursive(graph, node, visited=None, order=None):
    if visited is None:
        visited = set()
    if order is None:
        order = []
    visited.add(node)
    order.append(node)
    for nei in graph.get(node, []):
        if nei not in visited:
            dfs_recursive(graph, nei, visited, order)
    return order

print("DFS recursive from A:", dfs_recursive(graph_unweighted, 'A'))
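
The recursive version can hit Python's recursion limit on deep graphs. Below is a minimal sketch of the explicit-stack variant; `dfs_iterative` and `demo_graph` are names introduced here, and the graph uses lists so the visiting order is deterministic.

```python
def dfs_iterative(graph, start):
    """Depth-first traversal using an explicit stack instead of recursion."""
    visited = set()
    order = []
    stack = [start]
    while stack:
        node = stack.pop()
        if node in visited:  # a node can be pushed more than once
            continue
        visited.add(node)
        order.append(node)
        # Push in reverse so the first listed neighbor is explored first
        for nei in reversed(list(graph.get(node, []))):
            if nei not in visited:
                stack.append(nei)
    return order

demo_graph = {
    'A': ['B', 'C'],
    'B': ['A', 'D', 'E'],
    'C': ['A', 'F'],
    'D': ['B'],
    'E': ['B', 'F'],
    'F': ['C', 'E'],
}

print("DFS iterative from A:", dfs_iterative(demo_graph, 'A'))  # ['A', 'B', 'D', 'E', 'F', 'C']
```

Pushing neighbors in reverse is a design choice that makes the order match a recursive traversal over the same adjacency lists.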


In [None]:
import heapq

def dijkstra(graph_w, start):
    # graph_w: dict[node] -> list[(neighbor, weight)]; every node must appear as a key, weights must be non-negative
    dist = {n: float('inf') for n in graph_w}
    dist[start] = 0.0
    pq = [(0.0, start)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist[u]:  # stale entry
            continue
        for v, w in graph_w[u]:
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return dist

graph_weighted = {
    'A': [('B', 2), ('C', 5)],
    'B': [('A', 2), ('D', 1), ('E', 3)],
    'C': [('A', 5), ('F', 2)],
    'D': [('B', 1)],
    'E': [('B', 3), ('F', 1)],
    'F': [('C', 2), ('E', 1)]
}

print("Dijkstra distances from A:", dijkstra(graph_weighted, 'A'))
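
Distances alone rarely suffice; usually the route is wanted too. The sketch below is one way to extend the same heap-based approach with a predecessor map. `dijkstra_with_paths`, `rebuild_path`, and `demo_weighted` are hypothetical names for this example, and the graph mirrors the one above.

```python
import heapq

def dijkstra_with_paths(graph_w, start):
    """Return (dist, prev) where prev[v] is the node before v on a shortest path."""
    inf = float('inf')
    dist = {start: 0.0}
    prev = {}
    pq = [(0.0, start)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, inf):  # stale heap entry
            continue
        for v, w in graph_w.get(u, []):
            nd = d + w
            if nd < dist.get(v, inf):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(pq, (nd, v))
    return dist, prev

def rebuild_path(prev, start, target):
    """Walk predecessors back from target; return [] if target is unreachable."""
    path = [target]
    while path[-1] != start:
        if path[-1] not in prev:
            return []
        path.append(prev[path[-1]])
    return path[::-1]

demo_weighted = {
    'A': [('B', 2), ('C', 5)],
    'B': [('A', 2), ('D', 1), ('E', 3)],
    'C': [('A', 5), ('F', 2)],
    'D': [('B', 1)],
    'E': [('B', 3), ('F', 1)],
    'F': [('C', 2), ('E', 1)],
}

dist, prev = dijkstra_with_paths(demo_weighted, 'A')
print("Distance A -> F:", dist['F'])                       # 6.0
print("Path A -> F    :", rebuild_path(prev, 'A', 'F'))    # ['A', 'B', 'E', 'F']
```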


## 4. Advanced Usage Patterns

### 4.1 `itertools` power tools
- `permutations`, `combinations`, `accumulate`, `groupby`, `product`.


In [None]:
from itertools import permutations, combinations, accumulate, groupby, product

print("Permutations of [1,2,3] choose 2:", list(permutations([1,2,3], 2)))
print("Combinations of [1,2,3] choose 2:", list(combinations([1,2,3], 2)))
print("Accumulate [1,2,3,4]:", list(accumulate([1,2,3,4])))

data = sorted(["apple", "apricot", "banana", "blueberry", "cherry"], key=lambda x: x[0])
grouped = {k:list(g) for k,g in groupby(data, key=lambda x: x[0])}
print("Groupby by first letter:", grouped)

print("Product of [0,1] x ['a','b']:", list(product([0,1], ['a','b'])))
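
The `sorted(...)` call before `groupby` above is not optional: `groupby` only merges *consecutive* items with equal keys. A small sketch with made-up data shows what happens without it.

```python
from itertools import groupby

def first_letter(word):
    return word[0]

words = ["apple", "banana", "apricot", "blueberry", "avocado"]

# Unsorted input: a new group starts every time the key changes
unsorted_groups = [(k, list(g)) for k, g in groupby(words, key=first_letter)]
print("Unsorted:", unsorted_groups)

# Sorted by the same key: one group per letter
sorted_groups = {k: list(g) for k, g in groupby(sorted(words, key=first_letter), key=first_letter)}
print("Sorted  :", sorted_groups)  # {'a': ['apple', 'apricot', 'avocado'], 'b': ['banana', 'blueberry']}
```

Collecting the unsorted result into a dict would silently keep only the last group for each key, which is an easy bug to miss.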


### 4.2 Generators & coroutines
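Generators are lazy: they yield values one by one. Use `yield from` to delegate to another generator. A generator can also receive values through `send()`, which is the basis of generator-based coroutines.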


In [None]:
def count_up_to(n):
    i = 1
    while i <= n:
        yield i
        i += 1

def first_n_squares(n):
    for i in range(1, n+1):
        yield i*i

def squares_via_delegate(n):
    # delegate to another generator with 'yield from'
    yield from first_n_squares(n)

print("First 5 via generator:", list(count_up_to(5)))
print("Squares via delegation:", list(squares_via_delegate(5)))
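
For the coroutine side of this section, here is a minimal sketch of a generator that consumes values via `send()`. `running_average` is a hypothetical example function; the key detail is that the generator must be advanced to its first `yield` before it can receive anything.

```python
def running_average():
    """Generator-based coroutine: receives numbers, yields the average so far."""
    total = 0.0
    count = 0
    average = None
    while True:
        value = yield average  # pauses here; send(value) resumes with that value
        total += value
        count += 1
        average = total / count

avg = running_average()
next(avg)              # prime the coroutine: run up to the first yield
print(avg.send(10))    # 10.0
print(avg.send(20))    # 15.0
print(avg.send(30))    # 20.0
avg.close()            # raises GeneratorExit inside the generator to stop it
```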


### 4.3 Context managers
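Use `with` to guarantee that cleanup runs even if the body raises. Write your own with `contextlib.contextmanager`, or with a class that defines `__enter__` and `__exit__`.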


In [None]:
from contextlib import contextmanager

@contextmanager
def open_file(name, mode):
    f = open(name, mode, encoding='utf-8')
    try:
        yield f
    finally:
        f.close()

with open_file("demo_temp.txt", "w") as f:
    _ = f.write("Hello, context manager!")
with open_file("demo_temp.txt", "r") as f:
    print("File says:", f.read())
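
The class-based protocol behind `with` can be sketched as follows. `Timer` is a hypothetical example class, not part of the standard library; it shows that `__exit__` runs whether or not the body raises, and that returning `False` lets the exception propagate.

```python
import time

class Timer:
    """Measure the wall-clock time spent inside a with-block."""
    def __enter__(self):
        self.elapsed = None
        self.start = time.perf_counter()
        return self                      # bound to the name after `as`

    def __exit__(self, exc_type, exc, tb):
        self.elapsed = time.perf_counter() - self.start
        return False                     # do not swallow exceptions from the body

with Timer() as t:
    sum(range(100_000))
print(f"Elapsed: {t.elapsed:.6f}s")

# __exit__ still runs when the body raises
try:
    with Timer() as failing:
        raise ValueError("boom")
except ValueError:
    print("Exception propagated; elapsed recorded:", failing.elapsed is not None)  # True
```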


### 4.4 Memoization & typing
`functools.lru_cache` adds transparent caching to pure functions; arguments must be hashable because they form the cache key.  
Type hints document contracts but are not enforced at runtime; run a static checker such as `mypy` separately in your environment.


In [None]:
from functools import lru_cache
from typing import List

@lru_cache(maxsize=None)
def fib(n: int) -> int:
    if n < 2:
        return n
    return fib(n-1) + fib(n-2)

def average(nums: List[int]) -> float:
    return sum(nums) / len(nums) if nums else float('nan')

print("fib(10):", fib(10))
print("average([1,2,3,4]):", average([1,2,3,4]))
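
To see what the cache is doing, `lru_cache` exposes `cache_info()` and `cache_clear()` on the decorated function. The sketch below uses a separately named `fib_cached` (a hypothetical copy of the function above) so its counters start from zero.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def fib_cached(n: int) -> int:
    if n < 2:
        return n
    return fib_cached(n - 1) + fib_cached(n - 2)

fib_cached(10)
info = fib_cached.cache_info()
print(info)  # CacheInfo(hits=8, misses=11, maxsize=None, currsize=11)

# Unhashable arguments cannot be used as cache keys
try:
    fib_cached([10])
except TypeError as e:
    print("Unhashable argument rejected:", e)

fib_cached.cache_clear()  # drop all cached results and reset the counters
print("Entries after clear:", fib_cached.cache_info().currsize)  # 0
```

Eleven misses correspond to the eleven distinct inputs 0 through 10; every other call is served from the cache.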


## 5. Wrap-up
You’ve reviewed:
- Internals of key containers (list, set, tuple, dict) and their performance trade-offs.
- Python “oaths”: Zen, identity vs equality, mutability pitfalls, and GC basics.
- BFS/DFS plus Dijkstra using `heapq`.
- Advanced patterns with `itertools`, generators, context managers, memoization, and typing.

**Suggested follow-ups:**
- Explore `collections` (`deque`, `UserDict`, `UserList`), `concurrent.futures`, `asyncio`.
- Try profiling with `timeit`, `cProfile`, `line_profiler` to guide optimizations.
