Let's work through a **real-world example** from **image processing** that compares **NumPy arrays** with Python's built-in lists. We'll perform a **grayscale conversion** of an image and compute its **average brightness**. Both are common image-processing tasks, and they make NumPy's advantages in **performance and simplicity** easy to measure.

### Problem Statement:

You are given a color image, and your task is to:
1. **Convert the image to grayscale** by averaging the RGB values of each pixel.
2. **Compute the average brightness** of the grayscale image.

We'll use Python lists and NumPy arrays to perform these operations and compare their performance.

---

### Step 1: Using Python Lists

First, let's implement the task using **Python lists**. Assume the image is represented as a 3D list where each pixel is a list of [R, G, B] values.

#### Code for Python Lists:

```python
import random
import time

# Simulate a 500x500 pixel image with random RGB values (using Python lists)
image_height = 500
image_width = 500
image = [[[random.randint(0, 255) for _ in range(3)] for _ in range(image_width)] for _ in range(image_height)]

# Function to convert an RGB image to grayscale using Python lists
def rgb_to_grayscale_list(image):
    grayscale_image = []
    for row in image:
        grayscale_row = []
        for pixel in row:
            # Calculate the average of the RGB values to get the grayscale value
            grayscale_value = sum(pixel) // 3
            grayscale_row.append(grayscale_value)
        grayscale_image.append(grayscale_row)
    return grayscale_image

# Function to calculate the average brightness of a grayscale image using Python lists
def average_brightness_list(grayscale_image):
    total_brightness = 0
    num_pixels = 0
    for row in grayscale_image:
        for value in row:
            total_brightness += value
            num_pixels += 1
    return total_brightness / num_pixels

# Measure the time for Python list operations
start_time = time.time()

# Convert to grayscale
grayscale_image_list = rgb_to_grayscale_list(image)

# Calculate average brightness
avg_brightness_list = average_brightness_list(grayscale_image_list)

end_time = time.time()
print(f"Average Brightness (Python lists): {avg_brightness_list}")
print(f"Time taken using Python lists: {end_time - start_time} seconds")
```

---

### Step 2: Using NumPy Arrays

Now, we'll perform the same operations using **NumPy arrays**.

#### Code for NumPy Arrays:

```python
import numpy as np
import time

# Simulate a 500x500 pixel image with random RGB values (using NumPy arrays)
image_np = np.random.randint(0, 256, (500, 500, 3), dtype=np.uint8)

# Measure the time for NumPy operations
start_time = time.time()

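# Convert to grayscale using NumPy's vectorized operations.
# The mean is accumulated in float64 and cast afterwards: passing dtype=np.uint8
# to np.mean would make each pixel's RGB sum wrap around at 256.
grayscale_image_np = np.mean(image_np, axis=2).astype(np.uint8)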

# Calculate the average brightness using NumPy's built-in mean function
avg_brightness_np = np.mean(grayscale_image_np)

end_time = time.time()
print(f"Average Brightness (NumPy): {avg_brightness_np}")
print(f"Time taken using NumPy arrays: {end_time - start_time} seconds")
```
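
A detail worth checking when averaging `uint8` channels is the accumulator type: the `dtype` argument of `np.mean` controls the type used for the intermediate sum, not only the type of the result. The sketch below uses a hypothetical one-pixel image (not part of the example above) to contrast an 8-bit accumulator with the default float64 one.

```python
import numpy as np

# Hypothetical one-pixel image: a light gray pixel whose channels sum to 600.
pixel = np.array([[[200, 200, 200]]], dtype=np.uint8)

# dtype=np.uint8 makes np.mean accumulate the sum in 8 bits,
# so 600 wraps around modulo 256 before the division by 3.
wrapped = np.mean(pixel, axis=2, dtype=np.uint8)

# Accumulating in the default float64 and casting afterwards avoids the wrap-around.
correct = np.mean(pixel, axis=2).astype(np.uint8)

print("8-bit accumulator:  ", wrapped[0, 0])
print("float64 accumulator:", correct[0, 0])
```

Only the second result equals the true channel average of 200.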

---

### Explanation of the Code:

- **Data Simulation**:
    - We simulate a **500x500 pixel image** with random RGB values (ranging from 0 to 255).
    - In the Python list version, we create a **nested list** structure to store the RGB values.
    - In the NumPy version, we use `np.random.randint` to generate a **3D NumPy array** where each pixel is a 1D array of 3 values (RGB).

- **Grayscale Conversion**:
    - In the list version, we loop over each pixel and compute the grayscale value by averaging the RGB values using a `for` loop.
    - In the NumPy version, we use `np.mean(image_np, axis=2)` to compute the grayscale values **across the third dimension (RGB)** using **vectorized operations**.

- **Average Brightness**:
    - In the list version, we manually loop through all the pixels and sum the grayscale values, then divide by the total number of pixels.
    - In the NumPy version, we simply call `np.mean(grayscale_image_np)` to compute the average brightness in one line of code, leveraging NumPy's optimized array operations.
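
To check that the two approaches described above really compute the same thing, they can be run on identical pixels. This is a minimal sketch under a few assumptions: the 4x5 test image and the fixed seed are hypothetical, and the NumPy side uses an integer sum with floor division (a variant of the `np.mean` call, chosen because it mirrors `sum(pixel) // 3` exactly).

```python
import numpy as np

def rgb_to_grayscale_list(image):
    # Same logic as the list version: integer average of R, G, B per pixel.
    return [[sum(pixel) // 3 for pixel in row] for row in image]

# Small hypothetical test image; a fixed seed keeps the run reproducible.
rng = np.random.default_rng(0)
image_np = rng.integers(0, 256, size=(4, 5, 3), dtype=np.uint8)
image_list = image_np.tolist()  # the same pixels as nested Python lists

gray_list = rgb_to_grayscale_list(image_list)

# Sum in a wider integer type so three uint8 values cannot overflow,
# then floor-divide to mirror `sum(pixel) // 3`.
gray_np = image_np.sum(axis=2, dtype=np.uint16) // 3

# Both grayscale images should match pixel for pixel.
assert gray_np.tolist() == gray_list

flat = [value for row in gray_list for value in row]
brightness_list = sum(flat) / len(flat)
brightness_np = float(gray_np.mean())
print(brightness_list, brightness_np)
```

Feeding both versions the same data separates correctness from performance: once the outputs agree, any timing difference is due to the implementation alone.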

---

### Expected Output:

```
Average Brightness (Python lists): X.XX
Time taken using Python lists: Y.YY seconds

Average Brightness (NumPy): X.XX
Time taken using NumPy arrays: Z.ZZ seconds
```
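
A single `time.time()` measurement can be noisy. As a hedged alternative, the sketch below uses the standard library's `timeit` module and reports the best of several repeats. The 100x100 image size, the helper names `brightness_list` and `brightness_numpy`, and the repeat counts are assumptions chosen to keep the run short; they are not taken from the example above.

```python
import timeit
import numpy as np

# Smaller hypothetical image so the list version finishes quickly.
rng = np.random.default_rng(0)
image_np = rng.integers(0, 256, size=(100, 100, 3), dtype=np.uint8)
image_list = image_np.tolist()  # identical pixels as nested lists

def brightness_list(image):
    # Grayscale conversion and averaging with plain Python loops.
    gray = [[sum(pixel) // 3 for pixel in row] for row in image]
    flat = [value for row in gray for value in row]
    return sum(flat) / len(flat)

def brightness_numpy(image):
    # Same computation, vectorized; uint16 sum avoids uint8 overflow.
    gray = image.sum(axis=2, dtype=np.uint16) // 3
    return float(gray.mean())

# Each repeat runs the function 5 times; the minimum is least affected by system noise.
runs = 5
t_list = min(timeit.repeat(lambda: brightness_list(image_list), number=runs, repeat=3)) / runs
t_numpy = min(timeit.repeat(lambda: brightness_numpy(image_np), number=runs, repeat=3)) / runs

print(f"Python lists: {t_list:.6f} s per run")
print(f"NumPy arrays: {t_numpy:.6f} s per run")
```

The absolute numbers depend on the machine, so the useful quantity is the ratio between the two timings.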

### Analysis:

1. **Performance**:
   - The NumPy version is **much faster** than the Python list version because its **vectorized operations** run in **compiled C code** rather than in Python's interpreter. In the 5000x5000 run recorded at the end of this section, the list version took about 32 seconds and the NumPy version about 0.35 seconds (both timings include generating the image).
   - Python's `for` loops pay interpreter overhead on every iteration, which adds up when iterating over large datasets like images. NumPy moves the loop into compiled code, so that per-element overhead disappears.

2. **Memory Efficiency**:
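   - Python lists store references to individual Python objects, so on a typical 64-bit build every channel value is reached through an 8-byte pointer to a Python integer object. NumPy arrays store the raw values in **contiguous blocks of memory**: a 500x500x3 `uint8` image occupies exactly 750,000 bytes, one byte per channel value. This makes them far more memory-efficient for large datasets like images.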

3. **Code Simplicity**:
   - In the list version, we have to write **explicit loops** to iterate through each pixel, while NumPy allows us to use **vectorized operations** with minimal code.
   - For example, computing the **grayscale conversion** and **average brightness** with NumPy requires just **two lines of code**, compared to multiple loops in Python lists.
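
The memory claim can be checked with a rough measurement. This is a sketch with stated limits: `sys.getsizeof` counts only each list object itself (header plus element pointers), not the integer objects the pointers refer to, so the list total below is a lower bound. The all-zero image is a hypothetical stand-in for real pixel data.

```python
import sys
import numpy as np

height, width = 500, 500
image_np = np.zeros((height, width, 3), dtype=np.uint8)

# One byte per channel value: 500 * 500 * 3 bytes of pixel data.
array_bytes = image_np.nbytes
print(f"NumPy array data: {array_bytes} bytes")

# Walk the nested list structure and add up the size of every list object.
image_list = image_np.tolist()
list_bytes = sys.getsizeof(image_list)
for row in image_list:
    list_bytes += sys.getsizeof(row)
    for pixel in row:
        list_bytes += sys.getsizeof(pixel)
print(f"Nested lists (lower bound): {list_bytes} bytes")
```

Even as a lower bound, the nested-list total comes out well above the array's 750,000 bytes, because each of the 250,000 pixels is its own list object.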

---

### Conclusion:

In this **real-world image processing example**, **NumPy arrays** offered better **performance**, **memory efficiency**, and **ease of use** than Python's built-in lists. When dealing with large datasets such as images, **NumPy's vectorized operations** process the data quickly, without the need for explicit Python loops.

This is why NumPy is the usual choice for tasks involving numerical data and large arrays, including image processing and other data-intensive operations.

In [6]:
import random
import time

# Simulate a 5000x5000 pixel image with random RGB values (using Python lists)
image_height = 5000
image_width = 5000

# Start the timer before building the image, so the measurement includes image generation
start_time = time.time()

image = [[[random.randint(0, 255) for _ in range(3)] for _ in range(image_width)] for _ in range(image_height)]

# Function to convert an RGB image to grayscale using Python lists
def rgb_to_grayscale_list(image):
    grayscale_image = []
    for row in image:
        grayscale_row = []
        for pixel in row:
            # Calculate the average of the RGB values to get the grayscale value
            grayscale_value = sum(pixel) // 3
            grayscale_row.append(grayscale_value)
        grayscale_image.append(grayscale_row)
    return grayscale_image

# Function to calculate the average brightness of a grayscale image using Python lists
def average_brightness_list(grayscale_image):
    total_brightness = 0
    num_pixels = 0
    for row in grayscale_image:
        for value in row:
            total_brightness += value
            num_pixels += 1
    return total_brightness / num_pixels

# Convert to grayscale
grayscale_image_list = rgb_to_grayscale_list(image)

# Calculate average brightness
avg_brightness_list = average_brightness_list(grayscale_image_list)

end_time = time.time()
print(f"Average Brightness (Python lists): {avg_brightness_list}")
print(f"Time taken using Python lists: {end_time - start_time} seconds")


Average Brightness (Python lists): 127.175232
Time taken using Python lists: 32.24377679824829 seconds


In [1]:
import numpy as np
import time

# Start the timer before generating the image, so the measurement includes image generation
start_time = time.time()

# Simulate a 5000x5000 pixel image with random RGB values (using NumPy arrays)
image_np = np.random.randint(0, 256, (5000, 5000, 3), dtype=np.uint8)



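# Convert to grayscale using NumPy's vectorized operations
# Caution: dtype=np.uint8 makes np.mean accumulate each RGB sum in 8 bits, so it wraps around at 256
grayscale_image_np = np.mean(image_np, axis=2, dtype=np.uint8)

# Calculate the average brightness using NumPy's built-in mean function
avg_brightness_np = np.mean(grayscale_image_np)

end_time = time.time()
print(f"Average Brightness (NumPy): {avg_brightness_np}")
print(f"Time taken using NumPy arrays: {end_time - start_time} seconds")


Average Brightness (NumPy): 42.17392492
Time taken using NumPy arrays: 0.3482694625854492 seconds

The brightness of 42.17 reported here does not match the 127.18 from the list version, and the gap is not random variation. Because of `dtype=np.uint8`, each pixel's RGB sum wraps around modulo 256 before it is divided by 3, so every grayscale value lands between 0 and 85. Replacing the call with `np.mean(image_np, axis=2).astype(np.uint8)` accumulates in float64 and yields grayscale values that agree with `sum(pixel) // 3`.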


In [2]:
import numpy as np
from PIL import Image

# Create a 5000x5000 RGB image filled with random pixel values
#image = np.zeros((5000, 5000, 3), dtype=np.uint8)  # Alternative: all-black image
image = np.random.randint(0, 256, (5000, 5000, 3), dtype=np.uint8)

# Paint red bands along the left and top edges
image[:4999, :100] = [255, 0, 0]  # Left band: rows 0-4998, columns 0-99
image[:100, :4999] = [255, 0, 0]  # Top band: rows 0-99, columns 0-4998

# Convert NumPy array to an image using PIL
img = Image.fromarray(image)

# Save the image
img.save("5000x5000_image.png")

# Display the image
img.show()
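
The two slice assignments in the cell above each select a long band, not a square: `image[:4999, :100]` covers nearly every row of the first 100 columns. The sketch below shows the difference on a smaller, hypothetical 500x500 canvas, using only NumPy so it runs without PIL. The variable names are illustrative.

```python
import numpy as np

red = [255, 0, 0]

# A square block in the top-left corner: first 100 rows AND first 100 columns.
block = np.zeros((500, 500, 3), dtype=np.uint8)
block[:100, :100] = red

# Two bands, as in the cell above: one down the left edge, one along the top edge.
bands = np.zeros((500, 500, 3), dtype=np.uint8)
bands[:, :100] = red   # every row, first 100 columns
bands[:100, :] = red   # first 100 rows, every column

# Count pixels whose three channels all match red.
block_red = int((block == red).all(axis=2).sum())
bands_red = int((bands == red).all(axis=2).sum())

print(block_red)  # 100 * 100 = 10000
print(bands_red)  # 500*100 + 100*500 - 100*100 (overlap) = 90000
```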
