# 3. Comprehensions: Concise Data & Collection Crafting

Comprehensions are a hallmark of Python, providing a concise and readable way to create collections like lists, sets, and dictionaries in a single line of code. They are often more efficient and "Pythonic" than using a standard `for` loop for the same task. Think of them as a blueprint for rapidly assembling a new dataset from an existing one.

- An alternative, one-line syntax for creating collections.
- The same result can always be achieved with a standard `for` loop.
- Best suited for creating collections based on straightforward logic; for complex, multi-step logic, a `for` loop is often more readable.
- Generator expressions combine well with `any()` and `all()` for concise, readable data checks.

## 3.1. List Comprehension: Building Lists On-the-Fly
- Syntax: `[expression for item in iterable if condition]` (the `if condition` part is optional)
- Note the square brackets `[ ]` for lists!

```python
# Create a list of the first 10 signal readings, each amplified by a factor of 2
amplified_signals = [signal * 2 for signal in range(1, 11)]
# -> [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]

# For comparison, the standard 'for' loop method:
amplified_signals_loop = []
for signal in range(1, 11):
    amplified_signals_loop.append(signal * 2)

# Another example, this time with a filtering condition
signal_strengths = [34, 55, 61, 22, 98, 45, 15]
high_priority_signals = [signal for signal in signal_strengths if signal > 50]
# -> [55, 61, 98]

# RECOMMENDATION: If a comprehension becomes longer than one line or hard to read, use a standard `for` loop for clarity.
```
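
The trailing `if` in a comprehension filters items out. To keep every item but choose its value based on a condition, a conditional expression (`a if condition else b`) goes in the expression position instead. A minimal sketch of the difference, reusing the `signal_strengths` values from the example above; the `"high"`/`"low"` labels and the name `labeled_signals` are illustrative assumptions:

```python
signal_strengths = [34, 55, 61, 22, 98, 45, 15]

# Filtering: the trailing `if` drops items that fail the test.
high_priority_signals = [signal for signal in signal_strengths if signal > 50]
# -> [55, 61, 98]

# Mapping: a conditional expression keeps every item but picks its value.
labeled_signals = ["high" if signal > 50 else "low" for signal in signal_strengths]
# -> ['low', 'high', 'high', 'low', 'high', 'low', 'low']
```

The filtered list is shorter than the input, while the mapped list always has the same length as the input.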

## 3.2. Dictionary Comprehension: Crafting Key-Value Records
- Syntax: `{key_expression: value_expression for item in iterable}`
- Note the curly braces `{ }` and the `key: value` pair!

```python
# Create a dictionary mapping artifact IDs to their calculated hash values (as an example)
artifact_ids = [101, 102, 103]  # An iterable collection
artifact_hashes = {f"ID-{item_id}": item_id ** 2 for item_id in artifact_ids}
# -> {'ID-101': 10201, 'ID-102': 10404, 'ID-103': 10609}

# Dictionaries are collections of key-value pairs.
```
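
Dictionary comprehensions also accept a trailing `if` filter and can unpack several loop variables at once, which is handy with `zip()` and `dict.items()`. The following is a sketch only; the artifact names, weights, and the threshold of 10 are hypothetical sample values, not data from this lesson:

```python
artifact_names = ["Energy Crystal", "Data Chip", "Alien Alloy"]  # hypothetical sample data
artifact_weights = [12.5, 0.3, 48.0]                             # hypothetical sample data

# Pair two parallel lists into one dictionary.
weight_by_artifact = {name: weight for name, weight in zip(artifact_names, artifact_weights)}
# -> {'Energy Crystal': 12.5, 'Data Chip': 0.3, 'Alien Alloy': 48.0}

# Filter an existing dictionary by iterating over its (key, value) pairs.
heavy_artifacts = {name: weight for name, weight in weight_by_artifact.items() if weight > 10}
# -> {'Energy Crystal': 12.5, 'Alien Alloy': 48.0}
```

When no transformation or filtering is needed, `dict(zip(artifact_names, artifact_weights))` gives the same result as the first comprehension and is arguably simpler.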

## 3.3. Set Comprehension: Assembling Unique Datasets
- Syntax: `{expression for item in iterable}`
- Note the curly braces `{ }`, but without a `key: value` pair.

```python
# Create a set of unique signal signatures from a list with duplicates
raw_pings = [10, 20, 15, 20, 10, 30, 15]  # An iterable collection
unique_ping_signatures = {ping ** 2 for ping in raw_pings}
# -> {400, 100, 900, 225} (order is not guaranteed, duplicates are removed)

# Sets are unordered, unindexed collections of unique items.
```
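
Because the expression is evaluated before items are added to the set, a set comprehension can normalize values and remove duplicates in one step. A small sketch, assuming a hypothetical `raw_callsigns` list with inconsistent casing and spacing:

```python
raw_callsigns = ["Nomad", " nomad", "SEEKER", "seeker ", "Vanguard"]  # hypothetical sample data

# Normalize each entry first; entries that become equal collapse into one.
unique_callsigns = {callsign.strip().lower() for callsign in raw_callsigns}
# contains 'nomad', 'seeker', 'vanguard' (in no particular order)

# Sets have no stable order; sorted() returns a list with a deterministic order.
ordered_callsigns = sorted(unique_callsigns)
# -> ['nomad', 'seeker', 'vanguard']

# Caution: {} creates an empty dictionary, not an empty set. Use set() instead.
empty_set = set()
```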

## 3.4. Generator Expressions (not Tuple Comprehensions!)
- Syntax: `(expression for item in iterable)`
- Note the round parentheses `( )`!

```python
# This creates a generator object.
data_stream = [1, 2, 3, 4, 5]
processed_stream_generator = (value ** 2 for value in data_stream)
# -> <generator object <genexpr> at 0x...>

# To create a TUPLE using comprehension-like syntax:
processed_stream_tuple = tuple(value ** 2 for value in data_stream)
# -> (1, 4, 9, 16, 25)

# A generator is a memory-efficient iterator that yields values one by one, on demand,
# rather than building the entire collection in memory at once.
```
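
Two practical consequences of this on-demand behavior are that values are computed only when requested, and that a generator can be consumed only once. A minimal sketch reusing `data_stream` from the example above; the other variable names are illustrative:

```python
data_stream = [1, 2, 3, 4, 5]
squares = (value ** 2 for value in data_stream)

first_square = next(squares)    # -> 1 (computed only when requested)
remaining_total = sum(squares)  # -> 54 (4 + 9 + 16 + 25; consumes the rest)
leftover = list(squares)        # -> [] (the generator is now exhausted)

# When a generator expression is the sole argument of a function call,
# the extra pair of parentheses can be omitted.
total = sum(value ** 2 for value in data_stream)  # -> 55
```

If the values are needed more than once, build a list or tuple instead, or create a fresh generator for each pass.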

## 3.5. System Checks with `any()` and `all()`
- `any(iterable)`: Returns `True` if **at least one** item in the iterable is true.
- `all(iterable)`: Returns `True` if **all** items in the iterable are true.
- These are often combined with generator expressions for powerful, readable checks.

```python
mission_logs = ["Status: Nominal", "Anomaly Detected!", "Status: Nominal", "System Error"]
target_keyword = "Anomaly"

# `any()` - check if the target keyword is present in *any* of the logs
is_target_present = any(target_keyword in log for log in mission_logs)

if is_target_present:
    print(f"Alert: The keyword '{target_keyword}' was found in at least one log entry!")  # -> This will print
else:
    print("Keyword not found in logs.")

# `all()` - check if *all* log entries are of type 'string'
all_are_strings = all(isinstance(log, str) for log in mission_logs)
# This creates a generator of True/False values, `all()` checks if all are True.

if all_are_strings:
    print("Data Integrity Check: All log entries are strings.")
```
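
Part of what makes `any()` and `all()` efficient with generator expressions is short-circuiting: `any()` stops at the first truthy value and `all()` stops at the first falsy one. The sketch below makes this visible with a hypothetical helper, `contains_anomaly`, that records each log entry it inspects; the helper and the `checked_logs` list are assumptions introduced purely for illustration:

```python
checked_logs = []  # records which entries were actually inspected

def contains_anomaly(log):
    """Hypothetical helper: note the log as inspected, then test it."""
    checked_logs.append(log)
    return "Anomaly" in log

mission_logs = ["Status: Nominal", "Anomaly Detected!", "Status: Nominal", "System Error"]

found = any(contains_anomaly(log) for log in mission_logs)
# found -> True; checked_logs holds only the first two entries,
# because any() stops at the first truthy result.

# Edge case: on an empty iterable, any() is False and all() is True.
empty_any = any([])  # -> False
empty_all = all([])  # -> True
```

Short-circuiting saves work here only because the argument is a generator expression. A list comprehension, as in `any([...])`, would evaluate every item before `any()` sees the first one.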

## Practice

1.  **Level: Easy (List Comprehension)**
    - You are given a list of `agent_callsigns` with inconsistent formatting.
    ```python
    agent_callsigns = [" pathfinder", "SPECTRE", "  Vanguard", "seeker", "Nomad  ", "alpha"]
    ```
    - Using a **list comprehension**, create a new list that contains only the callsigns that start with the letter 'p' or 'v' (case-insensitive, ignoring leading whitespace). The callsigns in the new list should be stripped of leading and trailing whitespace and have their first letter capitalized (title case).

---

2.  **Level: Medium (Set Comprehension)**
    - You have a log of all artifacts recovered during multiple missions, with many duplicates.
    ```python
    recovered_artifacts_log = [
        "Energy Crystal", "Ancient Scroll", "Energy Crystal", "Data Chip", "Locator Beacon",
        "Data Chip", "Energy Crystal", "Alien Alloy", "Locator Beacon", "Ancient Scroll"
    ]
    ```
    - **a)** Using **set comprehension**, create a set of `unique_artifacts` that were recovered.
    - **b)** Using **set comprehension**, create a set of `long_artifact_names` containing only the artifacts with names longer than 10 characters.
    - **c)** Using **set comprehension**, create a set of `crystal_artifacts` containing only the artifacts that include the word "Crystal".

---

3.  **Level: Medium (List Comprehension with Dictionaries)**
    - You have personnel records as a list of dictionaries.
    ```python
    personnel_records = [
        {"name": "Alice", "age": 32, "department": "Science", "clearance": 4},
        {"name": "Bob", "age": 24, "department": "Engineering", "clearance": 3},
        {"name": "Charlie", "age": 38, "department": "Security", "clearance": 5},
        {"name": "Dana", "age": 45, "department": "Command", "clearance": 5},
        {"name": "Eve", "age": 29, "department": "Engineering", "clearance": 4}
    ]
    ```
    - **a)** Using **list comprehension**, create a list of the names of all personnel older than 30.
    - **b)** Using **list comprehension**, create a list of the names of all personnel in the "Engineering" department.
    - **c)** Using **list comprehension**, create a list of names of personnel who have a clearance level of 5.

---

4.  **Level: Medium / Hard (Dictionary Comprehension)**
    - You have a dictionary of `operative_skill_scores`.
    ```python
    operative_skill_scores = {
        "Pathfinder": 92,
        "Spectre": 78,
        "Vanguard": 85,
        "Seeker": 95,
        "Nomad": 69
    }
    ```
    - **a)** Using **dictionary comprehension**, create a new dictionary `operative_status` where operatives with a score of 85 or higher are assigned the status `"Elite"`, and all others are assigned `"Field-Ready"`.
    - **b)** Using **dictionary comprehension**, create a new dictionary `normalized_scores` where each operative's score is converted to a 0.0 to 1.0 scale (assume the maximum possible score is 100).
    - **c)** Using **dictionary comprehension**, create a new dictionary `needs_retraining` containing only the operatives with a score below 80.

---

5.  **Level: Medium (Using `any()` and `all()`)**
    - You have data on exploration team performance, where each member has a list of scores from recent simulation runs.
    ```python
    team_performance_log = [
        ("Pathfinder", [8, 9, 7]),
        ("Spectre", [9, 9, 10]),
        ("Vanguard", [7, 8, 8]),
        ("Seeker", [10, 10, 10]),
        ("Nomad", [6, 7, 8])
    ]
    ```
    - **a)** Using `any()` and a generator expression, check if there is **at least one** operative who scored a perfect `10` in any of their simulations. Print `True` or `False`.
    - **b)** Using `all()` and a generator expression, check if **all** operatives have an average score greater than `7`. Print `True` or `False`.

---
#### © Jiří Svoboda (George Freedom)
- Web: https://GeorgeFreedom.com
- LinkedIn: https://www.linkedin.com/in/georgefreedom/
- Book me: https://cal.com/georgefreedom