# From Python to Production
## Notebook 9: Idioms & Best Practices

By **Prerna Joshi** | #25DaysOfDataTech 

"Write Python the Pythonic way: small habits turn into production-ready code."

---

### What you'll learn
- Pythonic idioms: `enumerate`, `zip`, unpacking, truthiness, the walrus `:=`
- EAFP vs LBYL, context managers, `pathlib`, and resource safety
- Collection patterns: comprehensions, `dict.get`, `setdefault`, `defaultdict`, `Counter`
- Functional touches: generators, iterators, `itertools`, `operator`
- Error handling & logging, guard clauses, “fail fast” design
- Typing & docstrings for maintainability; dataclasses for records
- Performance & profiling tips; avoiding common code smells
- Testing mindset & small refactors that scale


> **Why this matters for data work**  
> Clean, idiomatic code is easier to review, test, and productionize. These patterns help you move faster without creating tech debt.


## 1. Pythonic Building Blocks


In [1]:
# enumerate: index + value
names = ["alice", "bob", "carol"]
idx_pairs = [(i, n) for i, n in enumerate(names, start=1)]

# zip: align iterables (stops at the shortest); use itertools.zip_longest to pad instead
scores = [91, 78, 88]
paired = list(zip(names, scores))

# tuple unpacking
a, b = (1, 2)
head, *mid, tail = [1,2,3,4,5]

idx_pairs, paired, (a, b), (head, mid, tail)


([(1, 'alice'), (2, 'bob'), (3, 'carol')],
 [('alice', 91), ('bob', 78), ('carol', 88)],
 (1, 2),
 (1, [2, 3, 4], 5))
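The `zip` comment in the cell above mentions `itertools.zip_longest` without showing it. Here is a minimal sketch of the difference, assuming a deliberately shortened `scores` list (the missing score is made up for illustration):

```python
from itertools import zip_longest

names = ["alice", "bob", "carol"]
scores = [91, 78]  # hypothetical: one score missing on purpose

# zip stops at the shortest input, silently dropping "carol"
short = list(zip(names, scores))

# zip_longest pads the shorter input with fillvalue instead
padded = list(zip_longest(names, scores, fillvalue=None))

print(short)   # [('alice', 91), ('bob', 78)]
print(padded)  # [('alice', 91), ('bob', 78), ('carol', None)]
```

If a length mismatch should count as a bug rather than be padded, `zip(names, scores, strict=True)` (Python 3.10+) raises `ValueError` instead.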

## 2. Truthiness & None checks

- Prefer `if items:` over `if len(items) > 0:`  
- Use `is None` / `is not None` for sentinel checks.


In [2]:
items = []
opt = None
bool(items), (opt is None), (opt == None)  # last form is discouraged


(False, True, True)
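One case where the two bullets above interact: a truthiness check also rejects legitimate falsy values such as `0`. The sketch below uses a hypothetical `limit` parameter to show why sentinel checks should use `is None`:

```python
def describe_limit(limit=None):
    # `is None` distinguishes "not provided" from a legitimate falsy value like 0
    if limit is None:
        return "no limit"
    return f"limit={limit}"


def describe_limit_buggy(limit=None):
    # a truthiness check treats 0 the same as None
    if not limit:
        return "no limit"
    return f"limit={limit}"


print(describe_limit(0))        # limit=0
print(describe_limit_buggy(0))  # no limit
```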

## 3. EAFP vs LBYL

- **EAFP** (Easier to Ask Forgiveness than Permission): wrap the operation in `try`/`except`.  
- **LBYL** (Look Before You Leap): check preconditions first.  

Prefer EAFP when it simplifies the code, and for concurrent or I/O-bound work, where the state can change between the check and the operation.


In [3]:
def parse_int_eafp(s, default=None):
    try:
        return int(s)
    except (TypeError, ValueError):
        return default

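def parse_int_lbyl(s, default=None):
    # isdecimal() only accepts characters int() can parse; signs and whitespace
    # are rejected, so "-5" or " 7 " fall back to default in this version
    return int(s) if isinstance(s, (int, str)) and str(s).isdecimal() else default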

parse_int_eafp("10"), parse_int_eafp("x", -1), parse_int_lbyl("42")


(10, -1, 42)
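The point about I/O can be made concrete with file access. In this sketch (function names and the file name are hypothetical), the LBYL version has a window between `exists()` and `read_text()` in which another process could delete the file. The EAFP version has no such gap:

```python
from pathlib import Path


def read_config_lbyl(path):
    p = Path(path)
    # race window: the file could disappear between exists() and read_text()
    if p.exists():
        return p.read_text(encoding="utf-8")
    return None


def read_config_eafp(path):
    # attempt the read and handle the one failure we expect
    try:
        return Path(path).read_text(encoding="utf-8")
    except FileNotFoundError:
        return None


print(read_config_eafp("no-such-file-for-this-demo.cfg"))  # None
```

Both return `None` for a missing file in the single-process case; the difference only shows up under concurrency, which is why it is easy to miss in a notebook.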

## 4. Context Managers & `pathlib`

Use `with` to ensure cleanup. Prefer `pathlib.Path` for paths.


In [4]:
from pathlib import Path

p = Path("demo.txt")
with p.open("w", encoding="utf-8") as f:
    f.write("hello\nworld")

with p.open(encoding="utf-8") as f:
    lines = f.read().splitlines()

lines, p.exists(), p.resolve().name


(['hello', 'world'], True, 'demo.txt')

## 5. Comprehensions & Dict Helpers


In [5]:
from collections import defaultdict, Counter

data = [("alice", 91), ("bob", 78), ("carol", 88), ("alice", 95)]
# dict comprehension
passed = {name: score for name, score in data if score >= 85}
# setdefault for grouping (one-liner)
group = {}
for name, score in data:
    group.setdefault(name, []).append(score)

# defaultdict alternative
group2 = defaultdict(list)
for name, score in data:
    group2[name].append(score)

# Counter for frequencies
freq = Counter(word.lower() for word in "To be or not to be".split())

passed, group, dict(group2), freq.most_common()


({'alice': 95, 'carol': 88},
 {'alice': [91, 95], 'bob': [78], 'carol': [88]},
 {'alice': [91, 95], 'bob': [78], 'carol': [88]},
 [('to', 2), ('be', 2), ('or', 1), ('not', 1)])

## 6. Generators, `yield`, and `itertools`


In [6]:
from itertools import islice, pairwise  # pairwise requires Python 3.10+

def tokens(text):
    return (t for t in text.lower().split())

def moving_sum(seq):
    total = 0
    for x in seq:
        total += x
        yield total

list(islice(tokens("Clean Code In Python"), 3)), list(moving_sum([1,2,3,4])), list(pairwise([1,2,3]))


(['clean', 'code', 'in'], [1, 3, 6, 10], [(1, 2), (2, 3)])

## 7. Walrus Operator `:=`: Assign in Expressions

Use sparingly to reduce repetition in loops and conditionals.


In [7]:
def find_first_long_word(words, n=5):
    for w in words:
        if (L := len(w)) > n:
            return w, L
    return None

find_first_long_word(["ai","ml","python","devops"], n=3)


('python', 6)
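Another common use is a read loop, where the walrus avoids repeating the read call before and inside the loop. A small sketch, using an in-memory `io.StringIO` as a stand-in for a real file (the `read_chunks` helper is made up for this example):

```python
import io


def read_chunks(stream, size=4):
    """Collect fixed-size chunks until the stream is exhausted."""
    chunks = []
    # read() returns "" at end of stream, which is falsy and ends the loop
    while (chunk := stream.read(size)):
        chunks.append(chunk)
    return chunks


print(read_chunks(io.StringIO("abcdefghij")))  # ['abcd', 'efgh', 'ij']
```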

## 8. Sorting Idioms


In [8]:
import operator as op

rows = [
    {"name":"alice","score":91},
    {"name":"bob","score":78},
    {"name":"carol","score":88},
]
by_score_desc = sorted(rows, key=op.itemgetter("score"), reverse=True)
names_natural = sorted({r["name"] for r in rows})  # set → unique, then sort

by_score_desc, names_natural


([{'name': 'alice', 'score': 91},
  {'name': 'carol', 'score': 88},
  {'name': 'bob', 'score': 78}],
 ['alice', 'bob', 'carol'])

## 9. Logging > `print` for Production

- Use `logging` with levels (`DEBUG`, `INFO`, `WARNING`, `ERROR`)  
- Keep prints for quick local debugging


In [9]:
import logging
logging.basicConfig(level=logging.INFO, format="%(levelname)s:%(message)s")
log = logging.getLogger("demo")

log.info("Job started")
for i in range(2):
    log.debug("loop %s", i)
log.warning("Using a default value")
"done"


INFO:Job started
WARNING:Using a default value


'done'

## 10. Errors, Guard Clauses, and 'Fail Fast'

Validate early and return early to keep functions flat and readable.


In [10]:
def normalize_non_empty(s: str) -> str:
    if s is None:
        raise TypeError("s cannot be None")
    s = s.strip()
    if not s:
        raise ValueError("empty after strip")
    return s.lower()

normalize_non_empty("  Hello  ")


'hello'

## 11. Typing & Docstrings: Communicate Intent


In [11]:
from collections.abc import Iterable

def mean(xs: Iterable[float]) -> float:
    """Return the arithmetic mean of a finite iterable of floats, or NaN if it is empty."""
    xs = list(xs)
    return sum(xs) / len(xs) if xs else float("nan")

mean([1.0, 2.0, 3.0])


2.0

## 12. Dataclasses as Lightweight Records


In [12]:
from dataclasses import dataclass, field

@dataclass
class RunConfig:
    seed: int = 0
    tags: list[str] = field(default_factory=list)

cfg = RunConfig(seed=42)
cfg.tags.append("ml")
cfg


RunConfig(seed=42, tags=['ml'])

## 13. Resource Helpers: `contextlib`


In [13]:
import contextlib
import os

# suppress specific exceptions
with contextlib.suppress(FileNotFoundError):
    os.remove("file-that-may-not-exist.txt")

# ExitStack: dynamically manage multiple contexts
from pathlib import Path
files = ["f1.txt","f2.txt"]
with contextlib.ExitStack() as stack:
    handles = [stack.enter_context(Path(f).open("w", encoding="utf-8")) for f in files]
    for i, h in enumerate(handles, 1):
        h.write(f"file {i}\n")

[Path(f).read_text(encoding="utf-8").strip() for f in files]


['file 1', 'file 2']

## 14. Performance & Profiling

- Measure first: `timeit` (micro), `cProfile`/`snakeviz` (macro)  
- Prefer algorithmic wins and vectorization; avoid micro-obsessing early


In [None]:
# quick micro-benchmark demo (numbers here are illustrative)
import timeit
t_join = timeit.timeit('"-".join(str(i) for i in range(1000))', number=200)
t_plus = timeit.timeit(
    """
s = ""
for i in range(1000):
    s += str(i) + "-"
""",
    number=200,
)
round(t_join, 4), round(t_plus, 4)


(0.0207, 0.0333)
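The bullets above mention `cProfile` for macro-level profiling, while the cell only shows `timeit`. A minimal sketch of profiling one call with the standard library follows; `slow_sum` is a made-up workload, and the timings will vary by machine:

```python
import cProfile
import io
import pstats


def slow_sum(n):
    # deliberately naive: builds a full list before summing
    return sum([i * i for i in range(n)])


profiler = cProfile.Profile()
profiler.enable()
result = slow_sum(100_000)
profiler.disable()

# send the report to a string buffer so it can be printed or logged
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer).sort_stats("cumulative")
stats.print_stats(5)  # show at most 5 entries, ordered by cumulative time
report = buffer.getvalue()
print(report)
```

The report lists call counts and cumulative time per function, which is usually enough to decide where a micro-benchmark with `timeit` is worth the effort.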

## 15. Code Smells & Small Refactors

- Long functions → extract helpers
- Deep nesting → guard clauses / early returns
- Repetition → utility functions; DRY responsibly
- Mutable global state → pass dependencies explicitly
- Hidden magic numbers → named constants
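Two of these refactors can be sketched on a toy function: replacing deep nesting with guard clauses, and replacing a magic number with a named constant. The names `PASS_MARK` and `grade_label` are hypothetical; the threshold of 85 is borrowed from the dict-comprehension example in section 5.

```python
PASS_MARK = 85  # named constant instead of a bare 85 scattered through the code


def grade_label_nested(score):
    # before: nested conditionals and a magic number
    if score is not None:
        if 0 <= score <= 100:
            if score >= 85:
                return "pass"
            else:
                return "fail"
        else:
            raise ValueError("score out of range")
    else:
        raise TypeError("score cannot be None")


def grade_label(score):
    # after: guard clauses reject bad input first, so the main path stays flat
    if score is None:
        raise TypeError("score cannot be None")
    if not 0 <= score <= 100:
        raise ValueError("score out of range")
    return "pass" if score >= PASS_MARK else "fail"


print(grade_label(91), grade_label(78))  # pass fail
```

Both versions behave the same; the second one just reads top to bottom without tracking which `else` belongs to which `if`.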


## 16. Testing Mindset

- Keep functions pure when possible → easier to test
- Use small fixtures; test both happy & edge paths
- Write tests for bugs (regressions) first
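As a rough illustration of these points, the sketch below tests a pure function with the standard-library `unittest` module, covering a happy path, an edge path, and a regression-style case. It assumes the same `mean` contract as section 11; the test names are made up.

```python
import math
import unittest


def mean(xs):
    """Arithmetic mean; NaN for an empty input (same contract as section 11)."""
    xs = list(xs)
    return sum(xs) / len(xs) if xs else float("nan")


class MeanTests(unittest.TestCase):
    def test_happy_path(self):
        self.assertEqual(mean([1.0, 2.0, 3.0]), 2.0)

    def test_empty_input_returns_nan(self):
        # edge path: the empty case is easy to break in a refactor
        self.assertTrue(math.isnan(mean([])))

    def test_accepts_generators(self):
        # regression-style test: a one-shot iterator must not be consumed twice
        self.assertEqual(mean(x for x in [2, 4]), 3.0)


# run the suite without exiting the interpreter (useful inside a notebook)
suite = unittest.defaultTestLoader.loadTestsFromTestCase(MeanTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

In a project, the same test class would typically live in its own file and be collected by a test runner rather than run inline.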


## 17. Practice (Try first, then reveal solutions)

1. **pairs_with_index**: Using `enumerate`, return `(idx, value)` for items longer than 3 chars.  
2. **safe_read**: Use `pathlib` + context manager to read text or return `""` if file missing (no custom try/except if you use `suppress`).  
3. **group_grades**: From `(name, score)` pairs, build dict of name → list of scores using `setdefault` in a loop.  
4. **top_k_tokens**: Generator pipeline to yield top-k tokens by frequency from a string.  
5. **first_match**: Using walrus, return first item whose `pred(x)` is true along with its index.  
6. **log_wrap**: A tiny decorator that logs function name before/after. Use `functools.wraps`.  
7. **n_sorted_unique**: Return the sorted unique values from a list (use set + sorted).  
8. **merge_two_dicts**: Merge two dicts with right-bias (Python 3.9+ `|`), fallback to unpacking for older.  
9. **safe_divide**: EAFP style: divide `a/b`; on errors log warning and return `default`.  
10. **to_slug**: Idiomatic slugify using lower + split + `"-".join`.  
11. **sorted_by_keypath**: Sort list of dicts by nested key `"a.b"` (use `operator` or a small getter).  
12. **time_fn**: Context manager that times a code block and stores elapsed on the object.


## 18. Practice Solutions  
*(Click to reveal after solving.)*

<details>
<summary><strong>Solution 1: pairs_with_index</strong></summary>

```python
def pairs_with_index(items):
    return [(i, s) for i, s in enumerate(items) if len(s) > 3]
```
</details>

<details>
<summary><strong>Solution 2: safe_read</strong></summary>

```python
from pathlib import Path
from contextlib import suppress

def safe_read(path) -> str:
    p = Path(path)
    with suppress(FileNotFoundError):
        return p.read_text(encoding="utf-8")
    return ""
```
</details>

<details>
<summary><strong>Solution 3: group_grades</strong></summary>

```python
def group_grades(pairs):
    out = {}
    for name, score in pairs:
        out.setdefault(name, []).append(score)
    return out
```
</details>

<details>
<summary><strong>Solution 4: top_k_tokens</strong></summary>

```python
from collections import Counter

def top_k_tokens(text, k=3):
    toks = (t for t in text.lower().split() if t.isalpha())
    cnt = Counter(toks)
    return cnt.most_common(k)
```
</details>

<details>
<summary><strong>Solution 5: first_match</strong></summary>

```python
def first_match(items, pred):
    for i, x in enumerate(items):
        if (ok := pred(x)):  # `ok` holds the predicate's result; it is not used further here
            return i, x
    return None
```
</details>

<details>
<summary><strong>Solution 6: log_wrap</strong></summary>

```python
import logging
from functools import wraps

def log_wrap(fn):
    @wraps(fn)
    def wrapper(*args, **kwargs):
        logging.info("start %s", fn.__name__)
        try:
            return fn(*args, **kwargs)
        finally:
            logging.info("end %s", fn.__name__)
    return wrapper
```
</details>

<details>
<summary><strong>Solution 7: n_sorted_unique</strong></summary>

```python
def n_sorted_unique(seq):
    return sorted(set(seq))
```
</details>

<details>
<summary><strong>Solution 8: merge_two_dicts</strong></summary>

```python
def merge_two_dicts(a, b):
    try:
        return a | b  # Python 3.9+
    except TypeError:
        c = a.copy()
        c.update(b)
        return c
```
</details>

<details>
<summary><strong>Solution 9: safe_divide</strong></summary>

```python
import logging

def safe_divide(a, b, default=None):
    try:
        return a / b
    except (ZeroDivisionError, TypeError) as e:  # catch only the failures division can raise
        logging.warning("divide failed: %s", e)
        return default
```
</details>

<details>
<summary><strong>Solution 10: to_slug</strong></summary>

```python
def to_slug(s: str) -> str:
    return "-".join(s.lower().split())
```
</details>

<details>
<summary><strong>Solution 11: sorted_by_keypath</strong></summary>

```python
def get_keypath(d, path):
    cur = d
    for part in path.split("."):
        cur = cur.get(part, None) if isinstance(cur, dict) else None
    return cur

def sorted_by_keypath(rows, path):
    return sorted(rows, key=lambda r: get_keypath(r, path))
```
</details>

<details>
<summary><strong>Solution 12: time_fn (context manager)</strong></summary>

```python
import time

class time_fn:
    def __enter__(self):
        self.t0 = time.perf_counter()
        return self
    def __exit__(self, exc_type, exc, tb):
        self.elapsed = (time.perf_counter() - self.t0) * 1000  # elapsed time in milliseconds
        return False  # don't suppress
```
</details>
