**⭐ 1. What This Pattern Solves**

Parsing semi-structured text (logs, config files, query strings) into usable data.

ETL pipelines that ingest key=value formats from files, APIs, or streams.

Transformations where text needs normalization into dictionaries for analytics.

Preparing data for aggregations, filtering, or enrichment in downstream pipelines.

**⭐ 2. SQL Equivalent**

In [0]:
%sql
-- Standard SQL has no built-in key=value parser; in PostgreSQL syntax, conceptually:
SELECT
    SPLIT_PART(kv, '=', 1) AS key,
    SPLIT_PART(kv, '=', 2) AS value  -- a value that itself contains '=' is truncated here
FROM raw_table, UNNEST(string_to_array(raw_text, ' ')) AS kv;

**⭐ 3. Core Idea**

Split strings by a delimiter, map into a dictionary, handle missing or malformed values gracefully.
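
One way to make "gracefully" concrete is to keep the tokens that could not be parsed instead of silently dropping them. The sketch below is only an illustration: the function name `parse_kv_with_rejects` and the sample input are hypothetical, and returning a `(parsed, rejects)` pair is one possible design, not the only one.

```python
def parse_kv_with_rejects(s: str, item_sep: str = " ", kv_sep: str = "=") -> tuple:
    """Hypothetical variant: return (parsed pairs, tokens that could not be parsed)."""
    parsed, rejects = {}, []
    for item in s.split(item_sep):
        if not item:
            continue  # empty token produced by repeated separators
        # partition never raises: sep is '' when kv_sep is absent
        key, sep, value = item.partition(kv_sep)
        if sep and key.strip():
            parsed[key.strip()] = value.strip()
        else:
            rejects.append(item)  # no separator, or an empty key
    return parsed, rejects


# Illustrative input: a double space, a token without '=', an empty value, an empty key
parsed, rejects = parse_kv_with_rejects("user=alice  garbage status= =orphan")
print(parsed)   # {'user': 'alice', 'status': ''}
print(rejects)  # ['garbage', '=orphan']
```

Keeping the rejects makes it possible to log or count malformed input downstream rather than losing it.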

**⭐ 4. Template Code (MEMORIZE THIS)**

In [0]:
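# TEMPLATE
def parse_kv_string(s: str, item_sep: str = ' ', kv_sep: str = '=') -> dict:
    """Parse 'k1=v1 k2=v2' into {'k1': 'v1', 'k2': 'v2'}; items without kv_sep are skipped."""
    result = {}
    for item in s.split(item_sep):
        if kv_sep in item:  # skip empty or malformed items
            k, v = item.split(kv_sep, 1)  # split on the first separator only
            result[k.strip()] = v.strip()  # a repeated key overwrites the earlier value
    return result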

**⭐ 5. Detailed Example**

In [0]:
log_line = "user=alice action=login status=success"
parsed = parse_kv_string(log_line)
print(parsed)
# {'user': 'alice', 'action': 'login', 'status': 'success'}


Step-by-step:

Split by ' ' → ['user=alice', 'action=login', 'status=success']

Split each by '=' → key-value pairs

Strip spaces → build dictionary
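
The three steps above can be traced by printing each intermediate value. This is a minimal sketch that unrolls the template on the same `log_line`; the variable names `items` and `pairs` are introduced here for illustration.

```python
log_line = "user=alice action=login status=success"

# Step 1: split the line into items on the item separator
items = log_line.split(" ")
print(items)   # ['user=alice', 'action=login', 'status=success']

# Step 2: split each item on the first '=' only
pairs = [item.split("=", 1) for item in items]
print(pairs)   # [['user', 'alice'], ['action', 'login'], ['status', 'success']]

# Step 3: strip whitespace and build the dictionary
parsed = {k.strip(): v.strip() for k, v in pairs}
print(parsed)  # {'user': 'alice', 'action': 'login', 'status': 'success'}
```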

**⭐ 6. Mini Practice Problems**

Parse "id=123 type=premium region=us-east" into a dictionary.

Handle "user=alice action=login status=" (empty value) gracefully.

Parse "key1=val1;key2=val2" using ; as item separator.
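
A possible set of solutions, sketched with the template function (repeated here so the block runs on its own). Treating an empty value as an empty string is an assumption; depending on the pipeline, `None` or dropping the key may be preferable.

```python
def parse_kv_string(s: str, item_sep: str = " ", kv_sep: str = "=") -> dict:
    result = {}
    for item in s.split(item_sep):
        if kv_sep in item:
            k, v = item.split(kv_sep, 1)
            result[k.strip()] = v.strip()
    return result


# Problem 1: default separators
p1 = parse_kv_string("id=123 type=premium region=us-east")
print(p1)  # {'id': '123', 'type': 'premium', 'region': 'us-east'}

# Problem 2: an empty value is kept as an empty string
p2 = parse_kv_string("user=alice action=login status=")
print(p2)  # {'user': 'alice', 'action': 'login', 'status': ''}

# Problem 3: ';' as the item separator
p3 = parse_kv_string("key1=val1;key2=val2", item_sep=";")
print(p3)  # {'key1': 'val1', 'key2': 'val2'}
```

Note that all values stay strings; `'123'` is not converted to an integer.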

**⭐ 7. Full Data Engineering Scenario**

Problem: Logs from a web server arrive in key=value format:

time=2025-12-16 user=bob action=click page=home
time=2025-12-16 user=alice action=login page=login

Goal: Parse each line into a dictionary:

[
  {'time': '2025-12-16', 'user': 'bob', 'action': 'click', 'page': 'home'},
  {'time': '2025-12-16', 'user': 'alice', 'action': 'login', 'page': 'login'}
]

Solution:

In [0]:
log_lines = [
    "time=2025-12-16 user=bob action=click page=home",
    "time=2025-12-16 user=alice action=login page=login",
]
parsed_logs = [parse_kv_string(line) for line in log_lines]
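
Once the lines are dictionaries, downstream aggregation and filtering become ordinary Python. The sketch below is self-contained and assumes the two log lines arrive as one multi-line string; the names `raw`, `actions`, and `alice_events` are introduced for illustration.

```python
from collections import Counter


def parse_kv_string(s: str, item_sep: str = " ", kv_sep: str = "=") -> dict:
    result = {}
    for item in s.split(item_sep):
        if kv_sep in item:
            k, v = item.split(kv_sep, 1)
            result[k.strip()] = v.strip()
    return result


# Assumed input shape: raw text with one log record per line
raw = """time=2025-12-16 user=bob action=click page=home
time=2025-12-16 user=alice action=login page=login
"""

# Skip blank lines before parsing
log_lines = [line for line in raw.splitlines() if line.strip()]
parsed_logs = [parse_kv_string(line) for line in log_lines]

# Aggregation: count events per action; .get() tolerates records missing the key
actions = Counter(rec.get("action", "unknown") for rec in parsed_logs)
print(actions)  # Counter({'click': 1, 'login': 1})

# Filtering: keep only alice's events
alice_events = [rec for rec in parsed_logs if rec.get("user") == "alice"]
print(alice_events)
```

Using `.get()` rather than `rec["action"]` avoids a `KeyError` when a log line lacks a field.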


**⭐ 8. Time & Space Complexity**

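Time Complexity: O(n), where n is the number of characters in the string

Space Complexity: O(n) in the worst case, since the split list and the stored keys and values together hold up to n characters; the dictionary itself has k entries, where k is the number of key-value pairs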

**⭐ 9. Common Pitfalls & Mistakes**

❌ Forgetting to strip whitespace → keys/values with leading/trailing spaces
❌ Not handling a missing = → ValueError when unpacking into k, v
❌ Splitting on every = → ValueError when a value contains = ✔ Correct: split(kv_sep, 1)
❌ Using regex unnecessarily → a simple split is faster and more readable
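
The split-related pitfalls are easy to reproduce. The snippet below is a small demonstration with made-up sample tokens (`"token=abc=="` and `"orphan"` are illustrative, not from real logs); it also shows `str.partition` as one alternative that never raises.

```python
item = "token=abc=="  # illustrative value that itself contains '='

# Pitfall: an unbounded split yields more than two parts, so unpacking fails
try:
    k, v = item.split("=")
    unbounded_ok = True
except ValueError:
    unbounded_ok = False
print(unbounded_ok)  # False

# Correct: split on the first '=' only, keeping the rest of the value intact
k, v = item.split("=", 1)
print(k, v)  # token abc==

# Pitfall: an item without '=' also fails on unpacking, even with maxsplit=1
try:
    k2, v2 = "orphan".split("=", 1)
    missing_ok = True
except ValueError:
    missing_ok = False
print(missing_ok)  # False

# Alternative: partition always returns three parts; sep is '' when '=' is absent
key, sep, value = "orphan".partition("=")
print(repr(key), repr(sep), repr(value))  # 'orphan' '' ''
```

The template sidesteps the missing-separator case by checking `kv_sep in item` before splitting.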