<a href="https://colab.research.google.com/github/BrianEsquivel-hexaware/PythonAcademy/blob/main/CorePython/Python_introduction_4.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# List Comprehension

Python comprehensions are expressions that create new **lists**, **dictionaries**, **sets** and **generators** from the values of existing iterables. They often let you replace a multi-line loop with a single concise, readable line.

A list comprehension consists of an expression followed by a `for` clause, and optionally one or more `if` clauses. The expression is evaluated for each item of the iterable named in the `for` clause that passes the `if` conditions (when present), and the result is a new list containing the results of each evaluation.

Here's the basic syntax:

```python
newlist = [expression for item in iterable if condition]
```
- The return value is a new list; the original iterable is left unchanged.

- The optional condition at the end lets us **filter** and **transform** at the same time: only items for which the condition is true are passed to the expression.

Here's an example of a list comprehension that squares the numbers in a list:

In [1]:
numbers = [1, 2, 3, 4, 5]
squared_numbers = [x**2 for x in numbers]
print(squared_numbers)

[1, 4, 9, 16, 25]


In [10]:
original_list = list(range(21))
print(original_list)
even_numbers = [x for x in original_list if x % 2 == 0]
print(even_numbers)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]
[0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20]
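
The two cells above show transforming and filtering separately. As a minimal sketch of doing both in one comprehension (the names `numbers` and `even_squares` are illustrative, not part of the original notebook), the condition picks the even values and the expression squares them:

```python
numbers = list(range(11))

# Filter: keep only the even values. Transform: square each value that is kept.
even_squares = [x**2 for x in numbers if x % 2 == 0]
print(even_squares)  # [0, 4, 16, 36, 64, 100]
```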


In [13]:
(2/2).is_integer()

True

### Other Comprehensions: Dictionaries and Sets

The same idea works for **dictionaries** and **sets**:

- **Set comprehension**: `{expression for item in iterable}`
- **Dict comprehension**: `{key_expression: value_expression for item in iterable}`

This is very useful when creating mappings or unique sets from data.

In [14]:
names = ["Alice", "Bob", "Charlie", "Alice"]

# Set comprehension: unique names (like DISTINCT)
unique_names = {name for name in names}
print("Unique names:", unique_names)

# Dict comprehension: map name -> length
name_lengths = {name: len(name) for name in names}
print("Name lengths:", name_lengths)

Unique names: {'Alice', 'Charlie', 'Bob'}
Name lengths: {'Alice': 5, 'Bob': 3, 'Charlie': 7}


In [16]:
list(enumerate(names, 1))

[(1, 'Alice'), (2, 'Bob'), (3, 'Charlie'), (4, 'Alice')]

In [20]:
{name: idx for idx, name in enumerate(names, 1)}

{'Alice': 4, 'Bob': 2, 'Charlie': 3}
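
Note that `'Alice'` maps to `4`, not `1`: dictionary keys are unique, so the later occurrence overwrites the earlier one. If every position is needed, one possible approach is sketched below (the variable names `last_position` and `all_positions` are assumptions made for illustration):

```python
names = ["Alice", "Bob", "Charlie", "Alice"]

# Later duplicates overwrite earlier ones, so "Alice" ends up mapped to 4.
last_position = {name: idx for idx, name in enumerate(names, 1)}

# To keep every position instead, collect them in a list per name.
all_positions = {}
for idx, name in enumerate(names, 1):
    all_positions.setdefault(name, []).append(idx)

print(last_position)  # {'Alice': 4, 'Bob': 2, 'Charlie': 3}
print(all_positions)  # {'Alice': [1, 4], 'Bob': [2], 'Charlie': [3]}
```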

In [19]:
# What if we don't need one of the two variables? -> Use the `_` placeholder
{idx for idx, _ in enumerate(names, 1)}

{1, 2, 3, 4}

# Optimizing in Python

How you optimize Python code depends on the problem you want to solve and on the features and bottlenecks of the program.

## Generators

Generator functions are a special kind of function that returns a lazy iterator. Lazy iterators are objects that you can loop over like a list. However, unlike lists, they do not store all of their contents in memory; each value is produced only when it is requested.

You can also define a generator expression (also called a generator comprehension), which has a very similar syntax to list comprehensions. In this way, you can use the generator without calling a function:
```python
csv_gen = (row for row in open(file_name))
```
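
To make the memory difference concrete, here is a rough sketch comparing a list comprehension with the equivalent generator expression. It assumes `sys.getsizeof` is a good enough indicator: it reports only the size of the container object itself, and the exact numbers vary by Python version and platform.

```python
import sys

# A list comprehension builds and stores all one million results up front.
squares_list = [x**2 for x in range(1_000_000)]

# A generator expression only stores the recipe for producing them.
squares_gen = (x**2 for x in range(1_000_000))

print(sys.getsizeof(squares_list))  # several megabytes (exact value varies)
print(sys.getsizeof(squares_gen))   # a few hundred bytes at most (exact value varies)

# Both produce the same values when consumed.
same_total = sum(squares_gen) == sum(squares_list)
print(same_total)  # True
```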

## Example: Generator for Order IDs

Generators are very useful when we want to **produce values on demand** instead of all at once.

A realistic example in data engineering is generating **sequential IDs** for orders, invoices, or tickets.

We can create a generator that yields one new order ID every time we iterate over it.

In [21]:
def order_id_generator(prefix="ORD", start=1000):
    """
    Generate sequential order IDs with a given prefix.

    Examples:
    - ORD-1000
    - ORD-1001
    - ORD-1002
    ...
    """
    current = start
    while True:
        yield f"{prefix}-{current}"
        current += 1


In [22]:
# Example: get the first 5 order IDs
ids = order_id_generator(prefix="ORD", start=1000)

for _ in range(5):
    print(next(ids))

ORD-1000
ORD-1001
ORD-1002
ORD-1003
ORD-1004


In [26]:
# Or, if you prefer a for loop with a break (to show infinite generator + break):
ids = order_id_generator(prefix="ORD", start=1000)

for order_id in ids:
    print(order_id)
    if order_id == "ORD-1004":
        break  # stop after 5 IDs

ORD-1000
ORD-1001
ORD-1002
ORD-1003
ORD-1004
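
A third option, shown here as a sketch, is `itertools.islice` from the standard library, which takes a fixed number of items from any iterator, including an infinite one. The generator is repeated so the block runs on its own; `first_five` is an illustrative name.

```python
from itertools import islice


def order_id_generator(prefix="ORD", start=1000):
    """Same generator as above, repeated so this block is self-contained."""
    current = start
    while True:
        yield f"{prefix}-{current}"
        current += 1


# islice stops after 5 items, so the infinite loop inside the generator is never a problem.
first_five = list(islice(order_id_generator(), 5))
print(first_five)  # ['ORD-1000', 'ORD-1001', 'ORD-1002', 'ORD-1003', 'ORD-1004']
```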


Later, when we work with files or databases, we can:

- Read data from a CSV or a table
- Use a generator like `order_id_generator()` to assign IDs
- Write the enriched data back to another file

The important idea is that the generator **produces values one by one**,
which is very convenient for streaming data.
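
As a small preview of that workflow, the sketch below attaches IDs to a few in-memory rows. The `orders` list is hypothetical sample data standing in for rows read from a CSV file or a table, and `enriched` is an illustrative name.

```python
def order_id_generator(prefix="ORD", start=1000):
    """Same generator as above, repeated so this block is self-contained."""
    current = start
    while True:
        yield f"{prefix}-{current}"
        current += 1


# Hypothetical rows, standing in for data read from a CSV file or a table.
orders = [
    {"customer": "Alice", "amount": 200},
    {"customer": "Bob", "amount": 150},
    {"customer": "Charlie", "amount": 250},
]

ids = order_id_generator()

# zip stops when the finite `orders` list runs out,
# so pairing it with the infinite generator is safe.
enriched = [{"order_id": order_id, **order} for order, order_id in zip(orders, ids)]

for row in enriched:
    print(row)
```

Each row receives the next ID in sequence, and the generator keeps its position, so a later batch would continue from `ORD-1003`.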

### Exercise:

* Write a generator that generates the first n even numbers (where n is a positive integer passed as an argument to the generator).
* Modify the above generator to also generate the first n odd numbers.
* Use the generator to print the first 10 even and odd numbers respectively.

In [35]:
def number_generator(n):
  if n < 0:
    return

  num = 0
  for _ in range(n):
    yield num
    num += 1

even_numbers = (x for x in number_generator(20) if x % 2 == 0)
odd_numbers = (x for x in number_generator(20) if x % 2 != 0)
print("Even numbers:")
for num in even_numbers:
  print(num)

print("Odd numbers:")
for num in odd_numbers:
  print(num)


Even numbers:
0
2
4
6
8
10
12
14
16
18
Odd numbers:
1
3
5
7
9
11
13
15
17
19
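
The solution above yields the first 20 integers and filters them. A variant closer to the exercise wording, where `n` is the number of values produced, might look like the sketch below. The helper `first_n` and its `start` parameter are hypothetical names chosen for illustration.

```python
def first_n(n, start=0):
    """Yield n numbers beginning at `start` in steps of 2.

    start=0 gives even numbers, start=1 gives odd numbers.
    """
    for i in range(n):  # range(n) is empty when n <= 0, so nothing is yielded
        yield start + 2 * i


evens = list(first_n(10, start=0))
odds = list(first_n(10, start=1))
print(evens)  # [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
print(odds)   # [1, 3, 5, 7, 9, 11, 13, 15, 17, 19]
```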


Here's an example of using generators in a business scenario:

Imagine you're working for a company that deals with large amounts of data, and you need to process it efficiently. You need to produce the IDs of all customers who made a purchase of at least a certain amount in a given time period.

In [28]:
def customer_id_generator(purchase_data, min_amount):
    for customer_id, purchase_amount in purchase_data:
        if purchase_amount >= min_amount:
            yield customer_id

# Example data
purchase_data = [
    (1, 200),
    (2, 150),
    (3, 250),
    (4, 300),
    (5, 175),
    (6, 400),
    (7, 100)
]

# Use the generator to get customer IDs for purchases of $200 or more
customer_ids = customer_id_generator(purchase_data, 200)
for customer_id in customer_ids:  # avoid shadowing the built-in `id`
    print(customer_id)

1
3
4
6


In this example, `customer_id_generator` is a generator function that takes the purchase data and a minimum purchase amount as arguments. It yields the IDs of customers whose purchase is at or above the specified amount, which is why customer 1 (exactly 200) appears in the output. Because it produces each customer ID one at a time instead of building the entire list at once, the data can be processed in a memory-friendly way.
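
The same filter can also be written as a generator expression. The sketch below reuses the sample data and additionally illustrates that a generator can be consumed only once. The names `cid`, `first_id`, `remaining` and `after_exhaustion` are illustrative.

```python
purchase_data = [(1, 200), (2, 150), (3, 250), (4, 300), (5, 175), (6, 400), (7, 100)]
min_amount = 200

# Generator expression equivalent to customer_id_generator(purchase_data, min_amount).
customer_ids = (cid for cid, amount in purchase_data if amount >= min_amount)

first_id = next(customer_ids)          # pulls only the first matching ID
remaining = list(customer_ids)         # consumes the rest
after_exhaustion = list(customer_ids)  # nothing left: generators are single-use

print(first_id)          # 1
print(remaining)         # [3, 4, 6]
print(after_exhaustion)  # []
```

If the IDs are needed more than once, either materialize them with `list(...)` or create a fresh generator.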

> Content created by [**Carlos Cruz-Maldonado**](https://www.linkedin.com/in/carloscruzmaldonado/).  
> I am available to answer any questions or provide further assistance.   
> Feel free to reach out to me at any time.  