# 14 - List Comprehensions

## Introduction

List comprehensions are a concise, Pythonic way to create lists. They are often faster than an equivalent `for` loop that calls `append`, and they are widely used in data processing. Understanding them will also help you when working with PySpark transformations.

## What You'll Learn

- Basic list comprehensions
- List comprehensions with conditions
- Nested list comprehensions
- When to use list comprehensions vs loops


## Basic List Comprehension

Instead of using a loop to create a list, you can use a list comprehension of the form `[expression for item in iterable]`, which fits on a single line.


In [1]:
# Traditional way (using loop)
squares_loop = []
for x in range(5):
    squares_loop.append(x ** 2)
print("Using loop:", squares_loop)

# List comprehension way (more Pythonic)
squares_comp = [x ** 2 for x in range(5)]
print("Using comprehension:", squares_comp)


Using loop: [0, 1, 4, 9, 16]
Using comprehension: [0, 1, 4, 9, 16]
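To check the speed difference on your own machine, a rough timing sketch with the standard-library `timeit` module can help. The function names and the values of `n` and `number` below are illustrative assumptions, and the timings will vary between machines and Python versions, so no expected numbers are shown.

```python
import timeit

def squares_with_loop(n):
    # Build the list with an explicit loop and append
    result = []
    for x in range(n):
        result.append(x ** 2)
    return result

def squares_with_comprehension(n):
    # Build the same list with a list comprehension
    return [x ** 2 for x in range(n)]

n = 1_000  # illustrative list size

# Run each version 200 times and report the total elapsed time
loop_time = timeit.timeit(lambda: squares_with_loop(n), number=200)
comp_time = timeit.timeit(lambda: squares_with_comprehension(n), number=200)

print(f"Loop:          {loop_time:.4f} s")
print(f"Comprehension: {comp_time:.4f} s")
```

Both functions return the same list; only the time taken to build it differs.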


## List Comprehension with Conditions

You can add an `if` clause after the `for` clause to keep only the elements that satisfy a condition.


In [2]:
# Get only even numbers
numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
even_numbers = [x for x in numbers if x % 2 == 0]
print("Even numbers:", even_numbers)

# Get squares of even numbers only
even_squares = [x ** 2 for x in numbers if x % 2 == 0]
print("Squares of even numbers:", even_squares)


Even numbers: [2, 4, 6, 8, 10]
Squares of even numbers: [4, 16, 36, 64, 100]
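A common point of confusion is where the `if` goes. The short sketch below contrasts the two positions: an `if` after the `for` clause filters items out, while an `if ... else` before the `for` clause is a conditional expression that keeps every item and only chooses its value. The variable names are illustrative.

```python
numbers = [1, 2, 3, 4, 5, 6]

# Filter: `if` after the `for` clause drops items that fail the test
evens_only = [x for x in numbers if x % 2 == 0]
print(evens_only)  # [2, 4, 6]

# Conditional expression: `if ... else` before the `for` clause keeps
# every item and picks one of two values for each
labels = ["even" if x % 2 == 0 else "odd" for x in numbers]
print(labels)  # ['odd', 'even', 'odd', 'even', 'odd', 'even']
```

The filtered list can be shorter than the input, whereas the conditional-expression list always has the same length as the input.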


## List Comprehension with Transformations

You can apply transformations to elements while creating the list.


In [3]:
# Convert strings to uppercase
words = ["hello", "world", "python", "data"]
uppercase_words = [word.upper() for word in words]
print("Uppercase words:", uppercase_words)

# Get lengths of words
word_lengths = [len(word) for word in words]
print("Word lengths:", word_lengths)


Uppercase words: ['HELLO', 'WORLD', 'PYTHON', 'DATA']
Word lengths: [5, 5, 6, 4]
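Transformations and conditions can be combined in one comprehension. As a small sketch, assume a hypothetical list `raw_words` containing messy strings: the comprehension below strips whitespace, lowercases each word, and skips entries that are empty after stripping.

```python
# Hypothetical messy input: extra spaces, mixed case, and empty entries
raw_words = ["  Hello ", "WORLD", "", " python", "   "]

# Strip and lowercase each word, skipping entries that are blank
cleaned_words = [word.strip().lower() for word in raw_words if word.strip()]
print(cleaned_words)  # ['hello', 'world', 'python']
```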


## Nested List Comprehensions

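You can place one list comprehension inside another. In the example below, the inner comprehension builds a single row and the outer comprehension repeats it for each value of `i`, producing a list of lists.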


In [4]:
# Create a multiplication table
multiplication_table = [[i * j for j in range(1, 6)] for i in range(1, 6)]
print("Multiplication table (5x5):")
for row in multiplication_table:
    print(row)


Multiplication table (5x5):
[1, 2, 3, 4, 5]
[2, 4, 6, 8, 10]
[3, 6, 9, 12, 15]
[4, 8, 12, 16, 20]
[5, 10, 15, 20, 25]
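A single comprehension can also contain more than one `for` clause, which behaves like nested loops and produces a flat list instead of a list of lists. The sketch below uses an illustrative `matrix` variable; the `for` clauses are read left to right, in the same order as the equivalent nested loops.

```python
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# Two `for` clauses in one comprehension: outer loop first, inner loop second
flat = [value for row in matrix for value in row]
print(flat)  # [1, 2, 3, 4, 5, 6, 7, 8, 9]

# Equivalent nested loops, for comparison
flat_loop = []
for row in matrix:
    for value in row:
        flat_loop.append(value)
print(flat_loop)  # [1, 2, 3, 4, 5, 6, 7, 8, 9]

# The same pattern can generate every combination of two ranges
pairs = [(i, j) for i in range(1, 3) for j in range(1, 4)]
print(pairs)  # [(1, 1), (1, 2), (1, 3), (2, 1), (2, 2), (2, 3)]
```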


## Practical Example: Processing Data

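List comprehensions suit common data processing tasks such as transforming values conditionally and filtering records. The first comprehension below uses a conditional expression (`if ... else` before the `for` clause), so every price is kept and only some are changed. The second uses a filter (`if` after the `for` clause), so some prices are dropped.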


In [5]:
# Process a list of prices (add tax, filter high prices)
prices = [10.99, 25.50, 5.00, 100.00, 15.75, 200.00]

# Add 10% tax to prices above $20
prices_with_tax = [price * 1.10 if price > 20 else price for price in prices]
print("Prices with tax:", prices_with_tax)

# Get only prices above $15
high_prices = [price for price in prices if price > 15]
print("High prices:", high_prices)


Prices with tax: [10.99, 28.05, 5.0, 110.00000000000001, 15.75, 220.00000000000003]
High prices: [25.5, 100.0, 15.75, 200.0]
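Values such as `110.00000000000001` in the output above come from binary floating-point arithmetic, not from the comprehension itself. One simple option, sketched below, is to round inside the comprehension with the built-in `round`. The variable name `prices_with_tax_rounded` is illustrative.

```python
prices = [10.99, 25.50, 5.00, 100.00, 15.75, 200.00]

# Add 10% tax to prices above $20 and round the taxed value to 2 decimal places
prices_with_tax_rounded = [
    round(price * 1.10, 2) if price > 20 else price
    for price in prices
]
print(prices_with_tax_rounded)  # [10.99, 28.05, 5.0, 110.0, 15.75, 220.0]
```

Rounding is fine for display. For exact monetary arithmetic, the standard-library `decimal` module may be a better fit.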


## Key Points to Remember

- List comprehensions are more concise and often faster than an equivalent loop that appends to a list
- Use them when creating new lists from existing data
- Put `if` after the `for` clause to filter elements; put `if ... else` before it to choose between two values
- If the logic is too complex, a regular loop might be more readable
- They're widely used in data processing, and similar concepts exist in PySpark for DataFrame transformations
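
As an illustration of logic that reads better as a regular loop, consider parsing strings that may not be valid numbers. A comprehension cannot contain a `try`/`except` statement, so a loop is the natural choice here. The input list `raw_values` and the result names are hypothetical.

```python
# Hypothetical input: some strings are not valid numbers
raw_values = ["10.5", "abc", "", "7", "3.2e1"]

parsed = []   # values that converted successfully
skipped = []  # values that could not be converted

for text in raw_values:
    try:
        parsed.append(float(text))
    except ValueError:
        # float() raises ValueError for non-numeric or empty strings
        skipped.append(text)

print(parsed)   # [10.5, 7.0, 32.0]
print(skipped)  # ['abc', '']
```

Once the values are clean, a comprehension is again a good fit for follow-up steps, for example `[value * 2 for value in parsed]`.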
