# Filter Function

The `filter()` function is a built-in Python function that allows you to filter elements from an iterable (like a list, tuple, etc.) based on a condition. It creates a new iterator containing only the elements that satisfy the given condition.

## Table of Contents
1. [Introduction to filter()](#introduction)
2. [Syntax and Parameters](#syntax)
3. [Basic Examples](#basic-examples)
4. [filter() with Lambda Functions](#lambda)
5. [filter() with Custom Functions](#custom-functions)
6. [filter() vs List Comprehension](#comparison)
7. [Combining filter() with map()](#combining)
8. [Common Use Cases](#use-cases)
9. [Summary](#summary)

## 1. Introduction to filter() <a id='introduction'></a>

The `filter()` function constructs an iterator from the elements of an iterable for which a function returns a truthy value. It's a concise tool for selecting specific elements from a collection based on a condition.

**Key Points:**
- Returns a filter object (iterator), not a list
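- Only includes elements for which the function returns a truthy value
- Memory efficient for large datasets, because elements are produced lazily, one at a time
- Part of Python's functional programming toolkit
- The filtering function may return any value; `filter()` uses its truthiness, so a strict `True`/`False` is not required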
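
Because `filter()` returns a lazy iterator, the predicate runs only when an element is actually requested. The sketch below records each predicate call to make this visible; `is_even_logged` and `calls` are illustrative names, not part of the standard library.

```python
calls = []  # records every value the predicate is asked about

def is_even_logged(n):
    """Hypothetical predicate that logs each call before testing n."""
    calls.append(n)
    return n % 2 == 0

lazy_evens = filter(is_even_logged, [1, 2, 3, 4])
calls_before = list(calls)  # snapshot: the predicate has not run yet
print("Calls after creating the filter:", calls_before)

first = next(lazy_evens)  # pulls elements until the first match
print("First match:", first)
print("Calls so far:", calls)

rest = list(lazy_evens)  # consumes the remaining elements
print("Remaining matches:", rest)
print("All calls:", calls)
```

Creating the filter object does no filtering work; each `next()` call advances only as far as the next matching element.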

## 2. Syntax and Parameters <a id='syntax'></a>

```python
filter(function, iterable)
```

**Parameters:**
- `function`: A one-argument function called on each element; elements for which it returns a truthy value are kept
  - If `None` is passed, the elements themselves are tested, so all falsy elements are removed
- `iterable`: Any iterable to filter (list, tuple, set, generator, etc.)

**Returns:**
- A filter object (iterator) that yields only the elements for which the function returned a truthy value
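
The predicate's return value is interpreted by truthiness, so it does not have to be a literal `True` or `False`. A small sketch, with sample data chosen purely for illustration:

```python
numbers = [1, 2, 3, 4, 5, 6]

# n % 2 is 1 (truthy) for odd numbers and 0 (falsy) for even numbers
odd_numbers = list(filter(lambda n: n % 2, numbers))
print("Odd numbers:", odd_numbers)

# str.strip returns a string; an empty result is falsy, so blank entries are dropped
names = ['Ann', '   ', '', 'Bo']
non_blank = list(filter(str.strip, names))
print("Non-blank names:", non_blank)
```

Note that the original elements are kept unchanged; the predicate's return value is used only for the keep-or-drop decision.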

In [None]:
# Example: Understanding filter object
numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Applying filter() - returns a filter object
result = filter(lambda x: x % 2 == 0, numbers)
print("Filter object:", result)  # Shows filter object
print("Type:", type(result))  # <class 'filter'>

# Convert to list to see results
even_numbers = list(filter(lambda x: x % 2 == 0, numbers))
print("Even numbers:", even_numbers)

## 3. Basic Examples <a id='basic-examples'></a>

Let's start with simple examples to understand how `filter()` works.

### Example 1: Filtering Numbers

In [None]:
# Filter even numbers
numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

def is_even(n):
    """Returns True if number is even"""
    return n % 2 == 0

even_numbers = list(filter(is_even, numbers))

print("Original numbers:", numbers)
print("Even numbers:", even_numbers)

In [None]:
# Filter odd numbers
numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

def is_odd(n):
    """Returns True if number is odd"""
    return n % 2 != 0

odd_numbers = list(filter(is_odd, numbers))

print("Original numbers:", numbers)
print("Odd numbers:", odd_numbers)
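
Since `is_odd` is simply the negation of `is_even`, the standard library's `itertools.filterfalse()` can reuse a single predicate for both halves. A minimal sketch:

```python
from itertools import filterfalse

numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

def is_even(n):
    """Returns True if number is even"""
    return n % 2 == 0

# filter() keeps elements where the predicate is truthy...
even_numbers = list(filter(is_even, numbers))
# ...and filterfalse() keeps the elements where it is falsy
odd_numbers = list(filterfalse(is_even, numbers))

print("Even numbers:", even_numbers)
print("Odd numbers:", odd_numbers)
```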

In [None]:
# Filter numbers greater than a threshold
numbers = [5, 12, 17, 3, 25, 8, 30, 15]
threshold = 10

def greater_than_threshold(n):
    """Returns True if number is greater than the module-level threshold"""
    return n > threshold

large_numbers = list(filter(greater_than_threshold, numbers))

print("Original numbers:", numbers)
print(f"Numbers greater than {threshold}:", large_numbers)
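
The predicate above reads `threshold` from the enclosing module scope. One alternative, sketched here with a hypothetical helper named `greater_than`, is to build the predicate from the limit so that each filter carries its own value:

```python
def greater_than(limit):
    """Hypothetical factory: returns a predicate that tests n > limit."""
    def predicate(n):
        return n > limit
    return predicate

numbers = [5, 12, 17, 3, 25, 8, 30, 15]

above_10 = list(filter(greater_than(10), numbers))
above_20 = list(filter(greater_than(20), numbers))

print("Greater than 10:", above_10)
print("Greater than 20:", above_20)
```

This keeps the predicate independent of global state, which tends to make it easier to reuse and test.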

### Example 2: Filtering with None

In [None]:
# When function is None, filter removes falsy values
# Falsy values: False, None, 0, 0.0, '', [], {}, etc.

mixed_values = [0, 1, False, True, '', 'hello', None, [], [1, 2], {}, {'a': 1}]

# Remove all falsy values
truthy_values = list(filter(None, mixed_values))

print("Original values:", mixed_values)
print("Truthy values only:", truthy_values)
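
One caveat: `filter(None, ...)` also drops `0` and `False`, which may be legitimate data. When only `None` should be removed, an explicit identity test is safer. A short sketch with made-up readings:

```python
readings = [3, 0, None, 7, None, 0, 12]

# Drops every falsy value, so the valid 0 readings disappear too
truthy_only = list(filter(None, readings))

# Drops only None and keeps the 0 readings
not_none = list(filter(lambda v: v is not None, readings))

print("filter(None, ...):", truthy_only)
print("is not None test: ", not_none)
```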

## 4. filter() with Lambda Functions <a id='lambda'></a>

Lambda functions are commonly used with `filter()` for concise, one-line filtering conditions.

In [None]:
# Example 1: Filter positive numbers
numbers = [-5, 3, -2, 8, -10, 15, 0, -1, 7]

positive_numbers = list(filter(lambda x: x > 0, numbers))

print("Original numbers:", numbers)
print("Positive numbers:", positive_numbers)

In [None]:
# Example 2: Filter strings by length
words = ['apple', 'hi', 'banana', 'cat', 'elephant', 'dog', 'strawberry']

# Words with more than 5 characters
long_words = list(filter(lambda word: len(word) > 5, words))

print("Original words:", words)
print("Words longer than 5 characters:", long_words)

In [None]:
# Example 3: Filter strings that start with a specific letter
fruits = ['apple', 'banana', 'avocado', 'cherry', 'apricot', 'blueberry', 'almond']

# Fruits starting with 'a'
a_fruits = list(filter(lambda fruit: fruit.startswith('a'), fruits))

print("All fruits:", fruits)
print("Fruits starting with 'a':", a_fruits)

In [None]:
# Example 4: Filter numbers in a specific range
numbers = [5, 15, 25, 35, 45, 55, 65, 75, 85]

# Numbers between 20 and 60 (inclusive)
range_numbers = list(filter(lambda x: 20 <= x <= 60, numbers))

print("Original numbers:", numbers)
print("Numbers between 20 and 60:", range_numbers)

## 5. filter() with Custom Functions <a id='custom-functions'></a>

For more complex filtering logic, custom functions provide better readability and maintainability.

In [None]:
# Example 1: Filter prime numbers
def is_prime(n):
    """Returns True if n is a prime number"""
    if n < 2:
        return False
    for i in range(2, int(n ** 0.5) + 1):
        if n % i == 0:
            return False
    return True

numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]
prime_numbers = list(filter(is_prime, numbers))

print("Numbers:", numbers)
print("Prime numbers:", prime_numbers)

In [None]:
# Example 2: Filter valid email addresses
def is_valid_email(email):
    """Basic email validation"""
    return '@' in email and '.' in email.split('@')[-1]

emails = [
    'user@example.com',
    'invalid.email',
    'another@domain.org',
    'bad@email',
    'good@mail.co.uk'
]

valid_emails = list(filter(is_valid_email, emails))

print("All emails:", emails)
print("Valid emails:", valid_emails)

In [None]:
# Example 3: Filter students who passed
def has_passed(student):
    """Returns True if student's grade is 60 or above"""
    return student['grade'] >= 60

students = [
    {'name': 'Alice', 'grade': 85},
    {'name': 'Bob', 'grade': 55},
    {'name': 'Charlie', 'grade': 72},
    {'name': 'Diana', 'grade': 45},
    {'name': 'Eve', 'grade': 90}
]

passing_students = list(filter(has_passed, students))

print("All students:")
for student in students:
    print(f"  {student['name']}: {student['grade']}")

print("\nPassing students:")
for student in passing_students:
    print(f"  {student['name']}: {student['grade']}")

## 6. filter() vs List Comprehension <a id='comparison'></a>

Both `filter()` and list comprehensions can filter elements. Here's a comparison:

| Feature | filter() | List Comprehension |
|---------|----------|-------------------|
| Syntax | `filter(func, iter)` | `[x for x in iter if condition]` |
| Returns | Iterator (filter object) | List |
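| Memory | Lazy; yields one element at a time | Builds the whole list in memory |
| Readability | Good when the predicate is an existing function | Often more readable for inline conditions |
| Flexibility | Limited to filtering only | Can filter AND transform |
| Speed | Depends on the predicate: can be faster with a built-in function, often slower with a lambda | Avoids a function call per element for inline conditions |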
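
Relative speed depends on the predicate, the data, and the Python version, so it is worth measuring rather than assuming. The following is a rough timing sketch using the standard `timeit` module; the input size and the repeat count of 200 are arbitrary choices, and the absolute numbers will differ between machines.

```python
import timeit

numbers = list(range(1000))

def with_filter_lambda():
    return list(filter(lambda x: x % 2 == 0, numbers))

def with_list_comprehension():
    return [x for x in numbers if x % 2 == 0]

# Both approaches must agree before a timing comparison means anything
assert with_filter_lambda() == with_list_comprehension()

# number=200 is an arbitrary repeat count chosen to keep the run short
filter_time = timeit.timeit(with_filter_lambda, number=200)
lc_time = timeit.timeit(with_list_comprehension, number=200)

print(f"filter() + lambda:  {filter_time:.4f} s")
print(f"list comprehension: {lc_time:.4f} s")
```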

In [None]:
# Comparison example
numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Using filter()
even_filter = list(filter(lambda x: x % 2 == 0, numbers))

# Using list comprehension
even_lc = [x for x in numbers if x % 2 == 0]

print("Using filter():", even_filter)
print("Using list comprehension:", even_lc)
print("Results are equal:", even_filter == even_lc)

In [None]:
# List comprehension can filter AND transform
numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Get squares of even numbers - requires both filter and map
squared_evens_filter = list(map(lambda x: x ** 2, filter(lambda x: x % 2 == 0, numbers)))

# Using list comprehension - more concise
squared_evens_lc = [x ** 2 for x in numbers if x % 2 == 0]

print("Using filter() + map():", squared_evens_filter)
print("Using list comprehension:", squared_evens_lc)

In [None]:
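# Memory efficiency demonstration
# Note: sys.getsizeof() measures only the object itself (for a list, the
# container and its references, not the integers it points to), so treat
# the numbers below as a rough comparison.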
import sys

# Create a larger dataset
large_numbers = range(10000)

# filter() returns an iterator (smaller memory footprint)
filter_result = filter(lambda x: x % 2 == 0, large_numbers)

# List comprehension creates a list immediately
lc_result = [x for x in large_numbers if x % 2 == 0]

print(f"Memory size of filter object: {sys.getsizeof(filter_result)} bytes")
print(f"Memory size of list: {sys.getsizeof(lc_result)} bytes")
print(f"\nList takes {sys.getsizeof(lc_result) / sys.getsizeof(filter_result):.1f}x more memory")
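
A generator expression offers the same lazy behavior with comprehension syntax, so the memory difference above is between lazy and eager evaluation rather than between `filter()` and comprehensions as such. A minimal sketch:

```python
import sys

large_numbers = range(10000)

# Parentheses instead of brackets make this lazy, like filter()
gen_result = (x for x in large_numbers if x % 2 == 0)
filter_result = filter(lambda x: x % 2 == 0, large_numbers)

print(f"Generator object size: {sys.getsizeof(gen_result)} bytes")
print(f"Filter object size:    {sys.getsizeof(filter_result)} bytes")

# Either one can feed an aggregate directly without building a list
gen_total = sum(gen_result)
filter_total = sum(filter_result)

print("Sum via generator expression:", gen_total)
print("Sum via filter():", filter_total)
```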

## 7. Combining filter() with map() <a id='combining'></a>

You can combine `filter()` and `map()` to first filter elements and then transform them.

In [None]:
# Example 1: Filter then transform
numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Get squares of only even numbers
result = list(map(lambda x: x ** 2, filter(lambda x: x % 2 == 0, numbers)))

print("Original numbers:", numbers)
print("Squares of even numbers:", result)

In [None]:
# Example 2: Filter strings, then uppercase
words = ['apple', 'hi', 'banana', 'cat', 'elephant', 'dog', 'kiwi']

# Get uppercase versions of words longer than 3 characters
result = list(map(str.upper, filter(lambda word: len(word) > 3, words)))

print("Original words:", words)
print("Long words (uppercase):", result)

In [None]:
# Example 3: Filter positive numbers, then calculate their square roots
import math

numbers = [-4, 9, -1, 16, 25, -9, 36, 49, -16, 64]

# Get square roots of positive numbers
result = list(map(math.sqrt, filter(lambda x: x > 0, numbers)))

print("Original numbers:", numbers)
print("Square roots of positive numbers:", result)

## 8. Common Use Cases <a id='use-cases'></a>

Here are some practical, real-world examples of using `filter()`:

### Use Case 1: Data Validation and Cleaning

In [None]:
# Remove empty strings and whitespace-only strings
data = ['John', '', 'Jane', '   ', 'Bob', '\t\n', 'Alice', ' ']

# Clean data - remove empty/whitespace strings
clean_data = list(filter(lambda s: s.strip(), data))

print("Original data:", data)
print("Cleaned data:", clean_data)

In [None]:
# Filter valid ages (between 0 and 120)
ages = [25, -5, 150, 30, 0, 45, 200, 18, 65, -10]

valid_ages = list(filter(lambda age: 0 <= age <= 120, ages))

print("All ages:", ages)
print("Valid ages (0-120):", valid_ages)

### Use Case 2: Filtering Based on User Criteria

In [None]:
# Filter products by price range
products = [
    {'name': 'Laptop', 'price': 1200},
    {'name': 'Mouse', 'price': 25},
    {'name': 'Keyboard', 'price': 75},
    {'name': 'Monitor', 'price': 300},
    {'name': 'USB Cable', 'price': 10}
]

min_price = 50
max_price = 500

# Products in price range
filtered_products = list(filter(
    lambda p: min_price <= p['price'] <= max_price,
    products
))

print(f"Products between ${min_price} and ${max_price}:")
for product in filtered_products:
    print(f"  {product['name']}: ${product['price']}")

In [None]:
# Filter users by age and membership status
users = [
    {'name': 'Alice', 'age': 25, 'is_member': True},
    {'name': 'Bob', 'age': 17, 'is_member': True},
    {'name': 'Charlie', 'age': 30, 'is_member': False},
    {'name': 'Diana', 'age': 22, 'is_member': True}
]

# Adult members only (age >= 18 and is_member = True)
adult_members = list(filter(
    lambda user: user['age'] >= 18 and user['is_member'],
    users
))

print("Adult members:")
for user in adult_members:
    print(f"  {user['name']} (Age: {user['age']})")

### Use Case 3: File and Text Processing

In [None]:
# Filter files by extension
files = [
    'document.pdf',
    'image.jpg',
    'script.py',
    'data.csv',
    'photo.png',
    'notes.txt',
    'program.py'
]

# Get only Python files
python_files = list(filter(lambda f: f.endswith('.py'), files))

print("All files:", files)
print("Python files:", python_files)
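
`str.endswith()` also accepts a tuple of suffixes, which lets one predicate match several extensions. A brief sketch, using a hypothetical `image_extensions` tuple and lowercasing the name so that `Scan.PNG` matches as well:

```python
files = [
    'document.pdf',
    'image.jpg',
    'script.py',
    'data.csv',
    'photo.png',
    'notes.txt',
    'Scan.PNG'
]

# Hypothetical set of extensions to keep
image_extensions = ('.jpg', '.png')

image_files = list(filter(lambda f: f.lower().endswith(image_extensions), files))

print("Image files:", image_files)
```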

In [None]:
# Filter lines containing a keyword
text_lines = [
    'Python is a great language',
    'Java is also popular',
    'Python has many libraries',
    'JavaScript runs in browsers',
    'Python is easy to learn'
]

keyword = 'Python'

# Lines containing the keyword
filtered_lines = list(filter(lambda line: keyword in line, text_lines))

print(f"Lines containing '{keyword}':")
for line in filtered_lines:
    print(f"  - {line}")
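
Because `filter()` accepts any iterable, it can also consume a file object line by line without loading the whole file first. The sketch below uses `io.StringIO` as a stand-in for an open file; the log contents are invented for illustration.

```python
import io

# Stand-in for open('app.log'); a real file object iterates the same way
log_file = io.StringIO(
    "INFO start\n"
    "ERROR disk full\n"
    "INFO retry\n"
    "ERROR timeout\n"
)

# Lines are read and tested one at a time as the filter is consumed
error_lines = filter(lambda line: line.startswith('ERROR'), log_file)
errors = [line.rstrip('\n') for line in error_lines]

print("Error lines:", errors)
```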

### Use Case 4: Numeric Data Analysis

In [None]:
# Filter outliers from a dataset
def is_not_outlier(value, data, threshold=2):
    """Check if value is not an outlier (within threshold standard deviations)"""
    mean = sum(data) / len(data)
    variance = sum((x - mean) ** 2 for x in data) / len(data)
    std_dev = variance ** 0.5
    
    return abs(value - mean) <= threshold * std_dev

data = [10, 12, 11, 13, 12, 100, 11, 14, 12, 13, 10]

# Remove outliers
clean_data = list(filter(lambda x: is_not_outlier(x, data), data))

print("Original data:", data)
print("Data without outliers:", clean_data)
print(f"Average (original): {sum(data) / len(data):.2f}")
print(f"Average (cleaned): {sum(clean_data) / len(clean_data):.2f}")
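
The predicate above recomputes the mean and standard deviation for every element. One possible refinement, sketched here with a hypothetical `make_outlier_check` factory and the standard `statistics` module, computes them once and returns a one-argument predicate:

```python
import statistics

data = [10, 12, 11, 13, 12, 100, 11, 14, 12, 13, 10]

def make_outlier_check(values, threshold=2):
    """Hypothetical factory: computes the statistics once, returns a predicate."""
    mean = statistics.fmean(values)
    # pstdev is the population standard deviation (divides by len(values))
    std_dev = statistics.pstdev(values)

    def is_not_outlier(value):
        return abs(value - mean) <= threshold * std_dev

    return is_not_outlier

clean_data = list(filter(make_outlier_check(data), data))

print("Data without outliers:", clean_data)
```

The result matches the per-element version, but the statistics are computed a single time instead of once per value.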

In [None]:
# Filter grades by letter grade category
grades = [95, 87, 76, 92, 68, 55, 81, 73, 90, 62]

def get_grade_category(score):
    """Returns letter grade for a score"""
    if score >= 90:
        return 'A'
    elif score >= 80:
        return 'B'
    elif score >= 70:
        return 'C'
    elif score >= 60:
        return 'D'
    else:
        return 'F'

# Get A and B grades only, reusing the category function defined above
top_grades = list(filter(lambda score: get_grade_category(score) in ('A', 'B'), grades))

print("All grades:", grades)
print("A and B grades (80+):", top_grades)
print(f"\nTop performers: {len(top_grades)} out of {len(grades)} students ({len(top_grades)/len(grades)*100:.1f}%)")

### Use Case 5: Filtering Complex Objects

In [None]:
# Filter employees by multiple criteria
employees = [
    {'name': 'Alice', 'department': 'Engineering', 'salary': 80000, 'years': 5},
    {'name': 'Bob', 'department': 'Sales', 'salary': 60000, 'years': 3},
    {'name': 'Charlie', 'department': 'Engineering', 'salary': 95000, 'years': 8},
    {'name': 'Diana', 'department': 'Marketing', 'salary': 70000, 'years': 4},
    {'name': 'Eve', 'department': 'Engineering', 'salary': 75000, 'years': 2}
]

# Senior engineers (Engineering dept, 5+ years, 75k+ salary)
senior_engineers = list(filter(
    lambda emp: emp['department'] == 'Engineering' and 
                emp['years'] >= 5 and 
                emp['salary'] >= 75000,
    employees
))

print("Senior Engineers:")
for emp in senior_engineers:
    print(f"  {emp['name']}: ${emp['salary']:,} ({emp['years']} years)")
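
When a multi-part lambda grows hard to read, the conditions can be split into small named predicates and combined. The sketch below assumes a hypothetical `all_of` helper (not a standard library function) built on the built-in `all()`:

```python
def all_of(*predicates):
    """Hypothetical helper: the result is true only if every predicate is."""
    def combined(item):
        return all(p(item) for p in predicates)
    return combined

def in_engineering(emp):
    return emp['department'] == 'Engineering'

def is_experienced(emp):
    return emp['years'] >= 5

def is_well_paid(emp):
    return emp['salary'] >= 75000

employees = [
    {'name': 'Alice', 'department': 'Engineering', 'salary': 80000, 'years': 5},
    {'name': 'Bob', 'department': 'Sales', 'salary': 60000, 'years': 3},
    {'name': 'Charlie', 'department': 'Engineering', 'salary': 95000, 'years': 8},
    {'name': 'Diana', 'department': 'Marketing', 'salary': 70000, 'years': 4},
    {'name': 'Eve', 'department': 'Engineering', 'salary': 75000, 'years': 2}
]

senior_engineers = list(filter(
    all_of(in_engineering, is_experienced, is_well_paid),
    employees
))
senior_names = [emp['name'] for emp in senior_engineers]

print("Senior engineers:", senior_names)
```

Each small predicate can be reused or tested on its own, and `all()` stops at the first failing condition, much like `and` does.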

## 9. Summary <a id='summary'></a>

### Key Takeaways:

1. **What is filter()?**
   - A built-in Python function that filters elements based on a condition
   - Returns an iterator (filter object), not a list
   - Only includes elements for which the test function returns a truthy value

2. **Syntax:**
   ```python
   filter(function, iterable)
   ```

3. **Common Uses:**
   - Removing unwanted elements from a collection
   - Data validation and cleaning
   - Selecting elements that meet specific criteria
   - Filtering based on complex conditions

4. **Advantages:**
   - Reads clearly when the predicate is an existing, well-named function
   - Memory efficient for large datasets (returns a lazy iterator)
   - Functional programming approach
   - Can be combined with other functions like map()

5. **When to Use filter() vs List Comprehension:**
   - Use `filter()` for pure filtering with existing functions
   - Use a list comprehension when you need to filter AND transform
   - Use `filter()` (or a generator expression) when you want lazy evaluation over large inputs
   - A list comprehension is often more readable for simple inline conditions

6. **Important Notes:**
   - Convert filter object to list using `list()` if you need a list
   - Filter objects can only be iterated once
   - Passing `None` as function removes all falsy values
   - The filtering function's return value is tested for truthiness; it does not have to be a strict boolean
   - Works great with lambda for simple conditions
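
The single-pass behavior noted above is easy to overlook. A minimal sketch of what happens when a filter object is consumed twice:

```python
numbers = [1, 2, 3, 4, 5, 6]
evens = filter(lambda x: x % 2 == 0, numbers)

first_pass = list(evens)   # consumes the iterator
second_pass = list(evens)  # already exhausted, so nothing is left

print("First pass:", first_pass)
print("Second pass:", second_pass)

# Materialize once if the results are needed more than once
evens_list = list(filter(lambda x: x % 2 == 0, numbers))
print("Reusable list:", evens_list, evens_list)
```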

### Best Practices:

- Use `filter()` for clear, straightforward filtering operations
- Combine with `list()`, `tuple()`, or `set()` when you need a concrete collection
- Use descriptive function names for complex filtering logic
- Consider list comprehensions for better readability in simple cases
- Combine with `map()` when you need to filter AND transform
- Remember that filter objects are lazy and only computed when needed
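
As a brief illustration of choosing a concrete collection, the sketch below feeds the same filter into `tuple()`, `set()`, and `sorted()`; the word list and the `is_short` predicate are invented for this example.

```python
words = ['pear', 'fig', 'apple', 'fig', 'kiwi', 'pear']

def is_short(word):
    """Illustrative predicate: True for words of at most 4 characters."""
    return len(word) <= 4

as_tuple = tuple(filter(is_short, words))    # keeps order and duplicates
as_set = set(filter(is_short, words))        # drops duplicates, no guaranteed order
as_sorted = sorted(filter(is_short, words))  # sorted() accepts any iterable

print("Tuple:", as_tuple)
print("Set:", as_set)
print("Sorted:", as_sorted)
```

Each call builds a fresh filter object, because a filter that has already been consumed cannot be reused.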

In [None]:
# Final comprehensive example
print("=" * 60)
print("FILTER FUNCTION - COMPREHENSIVE EXAMPLE")
print("=" * 60)

# E-commerce order data
orders = [
    {'id': 1001, 'customer': 'Alice', 'amount': 150, 'status': 'completed', 'priority': True},
    {'id': 1002, 'customer': 'Bob', 'amount': 50, 'status': 'pending', 'priority': False},
    {'id': 1003, 'customer': 'Charlie', 'amount': 250, 'status': 'completed', 'priority': True},
    {'id': 1004, 'customer': 'Diana', 'amount': 75, 'status': 'cancelled', 'priority': False},
    {'id': 1005, 'customer': 'Eve', 'amount': 300, 'status': 'completed', 'priority': True},
]

print("\nOriginal Orders:")
for order in orders:
    print(f"  #{order['id']}: {order['customer']} - ${order['amount']} ({order['status']})")

# Filter 1: High-value completed orders
high_value_completed = list(filter(
    lambda o: o['status'] == 'completed' and o['amount'] >= 200,
    orders
))

print("\n" + "=" * 60)
print("High-Value Completed Orders (>= $200):")
for order in high_value_completed:
    print(f"  #{order['id']}: {order['customer']} - ${order['amount']}")

# Filter 2: Priority orders
priority_orders = list(filter(lambda o: o['priority'], orders))

print("\n" + "=" * 60)
print("Priority Orders:")
for order in priority_orders:
    print(f"  #{order['id']}: {order['customer']} - ${order['amount']}")

# Calculate total revenue from completed orders
completed_orders = list(filter(lambda o: o['status'] == 'completed', orders))
total_revenue = sum(order['amount'] for order in completed_orders)

print("\n" + "=" * 60)
print(f"Total Revenue (Completed Orders): ${total_revenue}")
print(f"Completed Orders: {len(completed_orders)} out of {len(orders)}")
print("=" * 60)