### Advanced Exercises: Common String Methods

These problems go a step beyond the basics you covered (case mappings, stripping, splitting/joining, substring search). Each exercise is immediately followed by a concise solution cell you can run.

#### Exercise 1 — Case-Insensitive Equality (the right way)
Write a function `equal_ci(a, b)` that returns `True` when two strings are equal **case-insensitively** using **casefolding** (not `.lower()` or `.upper()`).

Test it with:
- `BMW` vs `bmw`
- `straße` vs `STRASSE`

In [1]:
def equal_ci(a: str, b: str) -> bool:
    return a.casefold() == b.casefold()

print(equal_ci('BMW', 'bmw'))      # True
print(equal_ci('straße', 'STRASSE'))  # True

True
True
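
To see why the exercise rules out `.lower()`, the sketch below runs both approaches on the `straße` pair. The helper names `equal_lower` and `equal_fold` are made up for this illustration and are not part of the exercise.

```python
def equal_lower(a: str, b: str) -> bool:
    # Naive comparison: lower() leaves 'ß' unchanged
    return a.lower() == b.lower()

def equal_fold(a: str, b: str) -> bool:
    # casefold() expands 'ß' to 'ss', so both sides become 'strasse'
    return a.casefold() == b.casefold()

print('straße'.lower(), 'straße'.casefold())  # straße strasse
print(equal_lower('straße', 'STRASSE'))       # False
print(equal_fold('straße', 'STRASSE'))        # True
```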


#### Exercise 2 — Unicode Normalization + Casefolding
Two visually identical strings may be encoded differently (precomposed vs combining marks). Implement `equal_unicode(a, b)` that first normalizes both strings to NFC, then casefolds, then compares.

Test:
- `'ê'` precomposed (`'\u00ea'`) vs `'ê'` written as `e` + combining circumflex (`'e\u0302'`)
- `'İ'` (U+0130, LATIN CAPITAL LETTER I WITH DOT ABOVE) vs `'i̇'` (`i` + combining dot above, `'i\u0307'`)

In [2]:
import unicodedata as _ud

def equal_unicode(a: str, b: str) -> bool:
    a_n = _ud.normalize('NFC', a).casefold()
    b_n = _ud.normalize('NFC', b).casefold()
    return a_n == b_n

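# Escape sequences make the two encodings visible in the source
print(equal_unicode('\u00ea', 'e\u0302'))  # True: NFC composes e + U+0302 into U+00EA
print(equal_unicode('\u0130', 'i\u0307'))  # True: U+0130 casefolds to 'i' + U+0307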

True
True
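
Casefolding can return a string that is no longer in normalized form, so a stricter variant normalizes a second time after folding. The sketch below follows the NFD, casefold, NFD pattern described in Python's Unicode HOWTO; the names `nfd` and `equal_canonical_caseless` are chosen here for illustration.

```python
import unicodedata

def nfd(s: str) -> str:
    return unicodedata.normalize('NFD', s)

def equal_canonical_caseless(a: str, b: str) -> bool:
    # Normalize, casefold, then normalize again because casefold()
    # may produce a string that is not in normalized form.
    return nfd(nfd(a).casefold()) == nfd(nfd(b).casefold())

print(equal_canonical_caseless('\u00ea', 'E\u0302'))  # True
print(equal_canonical_caseless('\u0130', 'i\u0307'))  # True
```

For the two test pairs in this exercise both approaches agree; the extra normalization only matters for the few characters whose casefolded form is not normalized.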


#### Exercise 3 — Title Case with Small-Word Exceptions
Write `smart_title(s, small_words)` that returns a title-cased string but **does not capitalize** words from `small_words` *except* when they are the first or last word.

Example:
- Input: `"the definitive guide to python"`, small words: `{"the","to","of","and"}`
- Output: `"The Definitive Guide to Python"`

In [3]:
def smart_title(s: str, small_words=None) -> str:
    if small_words is None:
        small_words = {"a","an","and","as","at","but","by","for","in","of","on","or","to","the"}
    words = s.split()
    if not words:
        return s
    def cap(word):
        return word[0:1].upper() + word[1:].lower()
    out = []
    for i, w in enumerate(words):
        wl = w.lower()
        if i == 0 or i == len(words)-1:
            out.append(cap(w))
        else:
            out.append(wl if wl in small_words else cap(w))
    return ' '.join(out)

print(smart_title('the definitive guide to python', {"the","to","of","and"}))

The Definitive Guide to Python
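
For comparison, the built-in `str.title()` capitalizes every word and treats an apostrophe as a word boundary, which is why a custom function is needed here. A quick check (the sample strings are only illustrative):

```python
text = 'the definitive guide to python'

# title() has no notion of small words: 'to' becomes 'To'
print(text.title())           # The Definitive Guide To Python

# title() restarts capitalization after the apostrophe
print("don't stop".title())   # Don'T Stop
```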


#### Exercise 4 — Strip **all** Kinds of Whitespace
Write `strip_all_ws(s)` that strips *any Unicode whitespace* from both ends (not just ASCII spaces). Test it on strings containing ordinary spaces, tabs, non-breaking spaces (`\u00A0`) and em spaces (`\u2003`).

In [4]:
def strip_all_ws(s: str) -> str:
    # Python's .strip() already removes all Unicode whitespace by default
    return s.strip()

samples = [
    '\t  hello\n',
    '\u00A0hello\u00A0',   # no-break space
    '\u2003hello\u2003'    # em space
]
for t in samples:
    print(repr(strip_all_ws(t)))  # all should print 'hello'

'hello'
'hello'
'hello'
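
One caveat: `.strip()` removes only characters for which `str.isspace()` is true. Invisible format characters such as the zero-width space (`\u200b`) or the byte-order mark (`\ufeff`) are not whitespace in that sense and survive. The sketch below shows one possible way to remove them as well; the helper `strip_ws_and_invisibles` and its default `extra` set are assumptions made for this example.

```python
def strip_ws_and_invisibles(s: str, extra: str = '\u200b\ufeff') -> str:
    # Alternate between whitespace and the extra characters
    # until neither kind remains at either end.
    prev = None
    while prev != s:
        prev = s
        s = s.strip().strip(extra)
    return s

print('\u00a0'.isspace(), '\u2003'.isspace(), '\u200b'.isspace())  # True True False
print(repr('\u200bhello\u200b'.strip()))                           # unchanged
print(repr(strip_ws_and_invisibles('\u200b \ufeff hello \u200b ')))  # 'hello'
```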


#### Exercise 5 — Find **all** Occurrences (including overlaps)
Implement `find_all(haystack, needle)` that returns a list of start indices for **every** occurrence of `needle` in `haystack`, including overlapping matches.

Example:
- `find_all('aaaa', 'aa') -> [0, 1, 2]`

In [5]:
def find_all(haystack: str, needle: str):
    if not needle:
        return []
    out, i = [], 0
    while True:
        i = haystack.find(needle, i)
        if i == -1:
            return out
        out.append(i)
        i += 1  # allow overlaps

print(find_all('aaaa', 'aa'))     # [0, 1, 2]
print(find_all('To be or not to be', 'be'))  # [3, 16]

[0, 1, 2]
[3, 16]
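
An alternative sketch uses a regular-expression lookahead, which matches at a position without consuming characters and therefore reports overlapping hits too. The name `find_all_re` is hypothetical; `re.escape` keeps characters like `.` in the needle literal.

```python
import re

def find_all_re(haystack: str, needle: str) -> list:
    if not needle:
        return []
    # (?=...) is a zero-width lookahead: every position where
    # the needle starts produces one (empty) match.
    pattern = re.compile('(?=' + re.escape(needle) + ')')
    return [m.start() for m in pattern.finditer(haystack)]

print(find_all_re('aaaa', 'aa'))                # [0, 1, 2]
print(find_all_re('To be or not to be', 'be'))  # [3, 16]
```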


#### Exercise 6 — Robust CSV-like Split & Join
Given a line like `name,age,city\n` but with commas possibly **inside quotes**, split it correctly into fields using the `csv` module, then re-join with `'; '`.

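Test with: `"Doe, ""Jane""",27,"New York, NY"` → `['Doe, "Jane"', '27', 'New York, NY']` → rejoined as `Doe, "Jane"; 27; New York, NY`. Note that CSV escapes a quote inside a quoted field by doubling it (`""`), not with a backslash.

In [6]:
import csv, io

# Inside a quoted CSV field, a literal quote is written as "" (doubled).
# A Python-level \" would just produce a bare quote and break the field.
row = '"Doe, ""Jane""",27,"New York, NY"\n'
fields = next(csv.reader(io.StringIO(row)))
print(fields)
print('; '.join(fields))

['Doe, "Jane"', '27', 'New York, NY']
Doe, "Jane"; 27; New York, NY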
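
Going the other way, `csv.writer` shows how a field containing commas and quotes is encoded: the field is wrapped in quotes and each embedded quote is doubled. This is a small round-trip sketch using the default dialect; the `fields` list is sample data for illustration.

```python
import csv, io

fields = ['Doe, "Jane"', '27', 'New York, NY']

# Write one row with the default dialect (minimal quoting, "\r\n" terminator)
buf = io.StringIO()
csv.writer(buf).writerow(fields)
line = buf.getvalue()
print(repr(line))  # '"Doe, ""Jane""",27,"New York, NY"\r\n'

# Reading the line back recovers the original fields
parsed = next(csv.reader(io.StringIO(line)))
print(parsed == fields)  # True
```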


#### Exercise 7 — `translate` with `maketrans`
Create a translation that:
1) removes all digits, and 
2) maps any of the characters `.,;:!?` to a single space.

Then normalize whitespace to single spaces.

Input: `"Py3thon,  is...  fun!? Yes; very: fun42."` → `"Python is fun Yes very fun"`

In [7]:
import string

punct = '.,;:!?'
# str.maketrans(x, y, z): map each character in x to the character at the
# same index in y, and delete every character in z.
table = str.maketrans(punct, ' ' * len(punct), string.digits)

s = 'Py3thon,  is...  fun!? Yes; very: fun42.'
clean = s.translate(table)
normalized = ' '.join(clean.split())
print(normalized)

Python is fun Yes very fun


#### Exercise 8 — Parsing with `partition` / `rpartition`
Given strings of the form `key=value=maybe_more`, split them into `(key, value)` such that key is everything **before the first `=`** and value is everything **after the first `=`** (may contain `=`). Then also demonstrate how to split on the **last** `=` using `rpartition`.

Test with:
- `"path=/usr/local/bin"`
- `"token=abc=123=xyz"`

In [8]:
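def split_first_eq(s: str):
    # partition splits on the first '='; sep is '' when no '=' is present
    left, sep, right = s.partition('=')
    return left, (right if sep else None)

def split_last_eq(s: str):
    # rpartition splits on the last '='; without '=', return (None, None)
    left, sep, right = s.rpartition('=')
    return (left, right) if sep else (None, None)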

print(split_first_eq('path=/usr/local/bin'))   # ('path', '/usr/local/bin')
print(split_first_eq('token=abc=123=xyz'))     # ('token', 'abc=123=xyz')
print(split_last_eq('token=abc=123=xyz'))      # ('token=abc=123', 'xyz')

('path', '/usr/local/bin')
('token', 'abc=123=xyz')
('token=abc=123', 'xyz')
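
For comparison, `split` and `rsplit` with `maxsplit=1` cut at the same places, but they return a list whose length depends on whether the separator was found, while `partition` always returns a 3-tuple. A short sketch (the string `'novalue'` is just a sample input without `=`):

```python
s = 'token=abc=123=xyz'

print(s.split('=', 1))    # ['token', 'abc=123=xyz']
print(s.rsplit('=', 1))   # ['token=abc=123', 'xyz']

# Without a separator, partition still yields three items...
print('novalue'.partition('='))   # ('novalue', '', '')
# ...whereas split yields a single-element list, so unpacking into two names fails
print('novalue'.split('=', 1))    # ['novalue']
```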
