# CS1350 Week 4 Homework: NumPy Array Operations

**Scientific Computing with NumPy**

- **Due Date:** Monday, Week 6 (beginning of class)
- **Expected Time:** 2-3 hours
- **Points:** 100
- **Submission:** Submit a single Python file `homework2_lastname_firstname.py`

## Learning Objectives

Upon completing this assignment, you will be able to:
- Create and manipulate NumPy arrays
- Perform mathematical operations on arrays
- Use array indexing and slicing effectively
- Apply broadcasting principles
- Implement basic statistical analysis with NumPy
- Compare performance between NumPy arrays and Python lists

---

## 📝 Conversion Disclaimer

**This notebook was converted from the original PDF assignment using Claude AI assistance.**

### Changes Made During Conversion:
- **Format conversion**: PDF → Jupyter notebook with interactive cells
- **Enhanced testing**: Added individual test cells after each problem for immediate feedback
- **Fixed numbering**: Corrected "Problem 6" to "Problem 5" (performance comparison) to match logical sequence
- **Added error handling**: Included `try`/`except` blocks in comprehensive testing section
- **Improved formatting**: Enhanced markdown structure, tables, and visual organization

### What Remains Unchanged:
✅ All function signatures and variable names  
✅ All TODO comments and requirements  
✅ All expected outputs and examples  
✅ All problem descriptions and point values  
✅ All academic guidelines and submission requirements  

**Note:** This conversion was done for educational convenience only. None of the code blocks with TODO sections should be edited by Claude AI.

---

## Setup Instructions

Run the following cell to import required libraries and set the random seed:

In [12]:
# Required import for all problems
import numpy as np
import time

# Set random seed for reproducible results
np.random.seed(1350)  # Use course number as seed

## Problem 1: Array Creation and Basic Operations (20 points)

Create arrays using different NumPy methods and perform basic operations.

**Expected outputs:**
- `arr1`: [10 15 20 25 30 35 40 45 50]
- `arr2`: 3x4 array of zeros
- `identity`: 3x3 identity matrix
- `linspace_arr`: 10 evenly spaced values from 0 to 5
- `random_arr`: 2x5 array with random values

In [13]:
def problem1():
    """
    Complete the following tasks:
    """
    # a) Create a 1D array of integers from 10 to 50 (inclusive) with step 5
    # Store in variable 'arr1'
    arr1 = np.arange(10, 51, 5)  # stop=51 so that 50 is included
    
    # b) Create a 2D array of shape (3, 4) filled with zeros
    # Store in variable 'arr2'
    arr2 = np.zeros((3,4))
    
    # c) Create a 3x3 identity matrix
    # Store in variable 'identity'
    identity = np.identity(3)
    
    # d) Create an array of 10 evenly spaced numbers between 0 and 5
    # Store in variable 'linspace_arr'
    linspace_arr = np.linspace(0, 5, 10)
    
    # e) Create a random array of shape (2, 5) with values between 0 and 1
    # Store in variable 'random_arr'
    random_arr = np.random.rand(2,5)
    
    return arr1, arr2, identity, linspace_arr, random_arr

In [14]:
# Test Problem 1
print("Problem 1 Results:")
results = problem1()
for i, result in enumerate(results, 1):
    print(f"Result {i}:")
    print(result)
    print()

Problem 1 Results:
Result 1:
[10 15 20 25 30 35 40 45 50]

Result 2:
[[0. 0. 0. 0.]
 [0. 0. 0. 0.]
 [0. 0. 0. 0.]]

Result 3:
[[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]

Result 4:
[0.         0.55555556 1.11111111 1.66666667 2.22222222 2.77777778
 3.33333333 3.88888889 4.44444444 5.        ]

Result 5:
[[0.20677793 0.48100137 0.44414215 0.10243488 0.77572263]
 [0.11940061 0.26187248 0.7060147  0.94824814 0.71155267]]
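
The difference between `np.arange` and `np.linspace` is easy to mix up, so here is a minimal sketch of how each treats its endpoint. The variable names `stepped` and `spaced` are illustrative only and are not part of the assignment.

```python
import numpy as np

# np.arange takes (start, stop, step); the stop value is excluded,
# so stop=51 is needed for 50 to appear in the result.
stepped = np.arange(10, 51, 5)
print(stepped)                  # [10 15 20 25 30 35 40 45 50]

# np.linspace takes (start, stop, num); the stop value is included by default.
spaced = np.linspace(0, 5, 10)
print(spaced.size)              # 10
print(spaced[0], spaced[-1])    # 0.0 5.0

# np.identity(n) and np.eye(n) build the same square identity matrix.
print(np.array_equal(np.identity(3), np.eye(3)))   # True
```

As a rule of thumb, `np.arange` fits when the step size is known and `np.linspace` fits when the number of points is known.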



## Problem 2: Array Mathematics and Broadcasting (20 points)

Perform mathematical operations demonstrating NumPy's broadcasting capabilities.

**Expected output for result_add:**
```
[[11 22 33]
 [14 25 36]
 [17 28 39]]
```

In [15]:
def problem2():
    """
    Perform array operations using broadcasting.
    """
    # Given arrays
    arr_a = np.array([[1, 2, 3],
                      [4, 5, 6],
                      [7, 8, 9]])
    arr_b = np.array([10, 20, 30])
    
    # a) Add arr_b to each row of arr_a (using broadcasting)
    result_add = arr_a + arr_b
    
    # b) Multiply each column of arr_a by the corresponding element in arr_b
    result_multiply = arr_a * arr_b
    
    # c) Calculate the square of all elements in arr_a
    result_square = arr_a ** 2
    
    # d) Calculate the mean of each column in arr_a
    column_means = np.mean(arr_a, axis=0)
    
    # e) Subtract the column means from each element in the respective column
    # This is called "centering" the data
    centered_arr = arr_a - column_means
    
    return result_add, result_multiply, result_square, column_means, centered_arr

In [16]:
# Test Problem 2
print("Problem 2 Results:")
results = problem2()
labels = ['Addition', 'Multiplication', 'Square', 'Column Means', 'Centered Array']
for label, result in zip(labels, results):
    print(f"{label}:")
    print(result)
    print()

Problem 2 Results:
Addition:
[[11 22 33]
 [14 25 36]
 [17 28 39]]

Multiplication:
[[ 10  40  90]
 [ 40 100 180]
 [ 70 160 270]]

Square:
[[ 1  4  9]
 [16 25 36]
 [49 64 81]]

Column Means:
[4. 5. 6.]

Centered Array:
[[-3. -3. -3.]
 [ 0.  0.  0.]
 [ 3.  3.  3.]]



## Problem 3: Array Indexing and Slicing (25 points)

Practice advanced indexing and slicing techniques.

**Original array:**
```
[[ 1  2  3  4  5]
 [ 6  7  8  9 10]
 [11 12 13 14 15]
 [16 17 18 19 20]
 [21 22 23 24 25]]
```

In [27]:
def problem3():
    """
    Demonstrate array indexing and slicing.
    """
    # Create a 5x5 array with values from 1 to 25
    arr = np.arange(1, 26).reshape(5, 5)
    print("Original array:")
    print(arr)
    print()
    
    # a) Extract the third row
    third_row = arr[2]
    
    # b) Extract the last column
    last_column = arr[:,-1]
    
    # c) Extract the 2x2 subarray from the center (rows 1-2, columns 1-2)
    center_subarray = arr[1:3,1:3]
    
    # d) Extract all elements greater than 15
    greater_than_15 = arr[arr > 15]
    
    # e) Replace all even numbers with -1 (create a copy first)
    arr_copy = arr.copy()
    arr_copy[arr_copy % 2 == 0] = -1
    
    return third_row, last_column, center_subarray, greater_than_15, arr_copy

In [28]:
# Test Problem 3
print("Problem 3 Results:")
results = problem3()
labels = ['Third Row', 'Last Column', 'Center Subarray', 'Greater than 15', 'Even Numbers Replaced']
for label, result in zip(labels, results):
    print(f"{label}:")
    print(result)
    print()

Problem 3 Results:
Original array:
[[ 1  2  3  4  5]
 [ 6  7  8  9 10]
 [11 12 13 14 15]
 [16 17 18 19 20]
 [21 22 23 24 25]]

Third Row:
[11 12 13 14 15]

Last Column:
[ 5 10 15 20 25]

Center Subarray:
[[ 7  8]
 [12 13]]

Greater than 15:
[16 17 18 19 20 21 22 23 24 25]

Even Numbers Replaced:
[[ 1 -1  3 -1  5]
 [-1  7 -1  9 -1]
 [11 -1 13 -1 15]
 [-1 17 -1 19 -1]
 [21 -1 23 -1 25]]
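
Part (e) asks for a copy before modifying the array. The sketch below illustrates why, under the usual NumPy behavior that basic slices are views while boolean indexing and `copy()` produce independent arrays. The names `center`, `big`, and `safe` are illustrative.

```python
import numpy as np

arr = np.arange(1, 26).reshape(5, 5)

# Basic slicing returns a view: writing to it would change `arr` too.
center = arr[1:3, 1:3]
print(np.shares_memory(arr, center))   # True

# Boolean (mask) indexing returns a new array with its own data.
big = arr[arr > 15]
print(np.shares_memory(arr, big))      # False

# An explicit copy is independent, so the original stays intact.
safe = arr.copy()
safe[safe % 2 == 0] = -1
print(arr[0])    # [1 2 3 4 5]
print(safe[0])   # [ 1 -1  3 -1  5]
```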



## Problem 4: Statistical Analysis (25 points)

Use NumPy's statistical functions to analyze data.

**Expected outputs:**
- `student_averages`: [86.25, 85.75, 91.0, 76.25, 90.0]
- `test_averages`: [83.8, 85.2, 85.6, 88.8]

In [29]:
def problem4():
    """
    Perform statistical analysis on student scores.
    """
    # Student test scores (rows: students, columns: tests)
    scores = np.array([[85, 90, 78, 92],
                       [79, 85, 88, 91],
                       [92, 88, 95, 89],
                       [75, 72, 80, 78],
                       [88, 91, 87, 94]])
    
    print("Student Scores:")
    print(scores)
    print()
    
    # a) Calculate the average score for each student (across all tests)
    student_averages = np.mean(scores, axis=1)
    
    # b) Calculate the average score for each test (across all students)
    test_averages = np.mean(scores, axis=0)
    
    # c) Find the highest score for each student
    student_max_scores = np.max(scores, axis=1)
    
    # d) Find the standard deviation of scores for each test
    test_std = np.std(scores, axis=0)
    
    # e) Identify which students have an average score above 85
    # Return a boolean array
    high_performers = student_averages > 85
    
    return student_averages, test_averages, student_max_scores, test_std, high_performers

In [30]:
# Test Problem 4
print("Problem 4 Results:")
results = problem4()
labels = ['Student Averages', 'Test Averages', 'Student Max Scores', 'Test Std Dev', 'High Performers']
for label, result in zip(labels, results):
    print(f"{label}:")
    print(result)
    print()

Problem 4 Results:
Student Scores:
[[85 90 78 92]
 [79 85 88 91]
 [92 88 95 89]
 [75 72 80 78]
 [88 91 87 94]]

Student Averages:
[86.25 85.75 91.   76.25 90.  ]

Test Averages:
[83.8 85.2 85.6 88.8]

Student Max Scores:
[92 91 95 80 94]

Test Std Dev:
[6.11228272 6.91086102 6.08604962 5.63560112]

High Performers:
[ True  True  True False  True]
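
One detail worth knowing about the standard deviation above: `np.std` divides by N by default (population standard deviation). A minimal sketch, using the first test's scores from the `scores` array, shows how the `ddof` parameter switches to the sample version. The names `test1`, `pop_std`, and `sample_std` are illustrative.

```python
import numpy as np

test1 = np.array([85, 79, 92, 75, 88])   # first column of `scores`

# Default ddof=0: divide by N (population standard deviation).
pop_std = np.std(test1)

# ddof=1: divide by N - 1 (sample standard deviation).
sample_std = np.std(test1, ddof=1)

print(round(pop_std, 4))       # 6.1123
print(sample_std > pop_std)    # True
```

Whether the population or sample version is wanted depends on the course's convention; the assignment text does not say, so the default is a reasonable reading.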



## Problem 5: Performance Comparison (10 points)

Compare the performance of NumPy arrays versus Python lists.

**Expected:** NumPy should be significantly faster (10x or more)

In [56]:
def problem5():
    """
    Compare performance between NumPy arrays and Python lists.
    Complete the timing comparisons.
    """
    size = 1000000 # upped size to have numpy time consistently >0
    
    # Create Python list and NumPy array with same data
    python_list = list(range(size))
    numpy_array = np.arange(size)
    
    # Task: Square all elements
    
    # Python list approach
    start_time = time.time()
    # TODO: Square all elements in python_list using list comprehension
    list_result = [x**2 for x in python_list]
    list_time = time.time() - start_time
    
    # NumPy array approach
    start_time = time.time()
    # TODO: Square all elements in numpy_array using NumPy operations
    array_result = numpy_array ** 2
    numpy_time = time.time() - start_time
    
    # Calculate speedup
    speedup = (list_time / numpy_time) if numpy_time > 0 else 0
    
    # Return times and speedup factor
    return {
        'list_time': list_time,
        'numpy_time': numpy_time,
        'speedup': speedup,
        'conclusion': f"NumPy is {speedup:.1f}x faster than Python lists for this operation"
    }

In [61]:
# Test Problem 5
print("Problem 5 Results:")
result = problem5()
for key, value in result.items():
    print(f"{key}: {value}")

Problem 5 Results:
list_time: 0.3157773017883301
numpy_time: 0.01135110855102539
speedup: 27.819071623608487
conclusion: NumPy is 27.8x faster than Python lists for this operation
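
A single `time.time()` measurement can be noisy, and very short runs may round to zero. As an alternative sketch (not the method required by the assignment), the standard-library `timeit` module repeats each statement and lets you take the best trial. The smaller `size` here is an assumption chosen to keep the example quick.

```python
import timeit
import numpy as np

size = 100_000
python_list = list(range(size))
numpy_array = np.arange(size)

# timeit.repeat runs each callable `number` times per trial and returns
# one total per trial; the minimum is the least noisy estimate.
list_time = min(timeit.repeat(lambda: [x**2 for x in python_list],
                              number=5, repeat=3))
numpy_time = min(timeit.repeat(lambda: numpy_array ** 2,
                               number=5, repeat=3))

print(f"list:  {list_time:.5f} s")
print(f"numpy: {numpy_time:.5f} s")
print(f"speedup: {list_time / numpy_time:.1f}x")
```

The exact speedup depends on the machine, the array size, and the NumPy version, so the printed numbers will differ from run to run.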


## Bonus Challenge (Optional, +25 points)

Create a function that implements a simple image processing operation using NumPy.

In [69]:
def bonus_challenge():
    """
    Create a simple 10x10 'image' and apply transformations.
    """
    # Create a 10x10 array representing a grayscale image
    # Values should be between 0 (black) and 255 (white)
    image = np.random.randint(0, 256, size=(10, 10))
    print("Original Image:")
    print(image)
    print()
    
    # a) Normalize the image (scale values to 0-1 range)
    normalized = image / 255
    
    # b) Apply brightness adjustment (increase all values by 50, cap at 255)
    brightened = np.clip(image + 50, 0, 255)
    
    # c) Create a negative of the image (invert values)
    negative = 255 - image
    
    # d) Apply threshold (values > 128 become 255, others become 0)
    thresholded = (image > 128) * 255
    
    return normalized, brightened, negative, thresholded
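
The brightness step works here because `np.random.randint` returns a wide integer type by default. Real image data is often stored as `uint8`, where adding 50 wraps around before `np.clip` sees the value. The sketch below uses a hypothetical 2x2 `pixels` array to show the pitfall and one way around it.

```python
import numpy as np

# Hypothetical 2x2 image stored as uint8 (assumed here for illustration).
pixels = np.array([[10, 200], [250, 255]], dtype=np.uint8)

# uint8 arithmetic wraps around modulo 256, so clipping afterwards is too late.
wrapped = np.clip(pixels + np.uint8(50), 0, 255)
print(wrapped.tolist())      # [[60, 250], [44, 49]]

# Widening to a larger integer type first keeps the true sums for np.clip.
brightened = np.clip(pixels.astype(np.int16) + 50, 0, 255).astype(np.uint8)
print(brightened.tolist())   # [[60, 250], [255, 255]]
```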

In [70]:
# Test Bonus Challenge
print("Bonus Challenge Results:")
results = bonus_challenge()
labels = ['Normalized', 'Brightened', 'Negative', 'Thresholded']
for label, result in zip(labels, results):
    print(f"{label}:")
    print(result)
    print()

Bonus Challenge Results:
Original Image:
[[ 66  81 191  20 100  19  90 145 232 180]
 [209  75  97 244   9 120 176  90  30   0]
 [ 70 170  21 239   1  12 171  13  65  76]
 [ 39 168  66 197 234 141  37 131 166  14]
 [ 62 255 167 102 223 237 204 116 144  10]
 [132  77   8 233 241 109 241 142 214   9]
 [213  79  54 179  79 181   4 203 191  22]
 [227 180 219 209 150  74  74 199 146 248]
 [247 173  83 162 179  57 120 193 187 101]
 [203 149 139 136  32 138 167  84  77 235]]

Normalized:
[[0.25882353 0.31764706 0.74901961 0.07843137 0.39215686 0.0745098
  0.35294118 0.56862745 0.90980392 0.70588235]
 [0.81960784 0.29411765 0.38039216 0.95686275 0.03529412 0.47058824
  0.69019608 0.35294118 0.11764706 0.        ]
 [0.2745098  0.66666667 0.08235294 0.9372549  0.00392157 0.04705882
  0.67058824 0.05098039 0.25490196 0.29803922]
 [0.15294118 0.65882353 0.25882353 0.77254902 0.91764706 0.55294118
  0.14509804 0.51372549 0.65098039 0.05490196]
 [0.24313725 1.         0.65490196 0.4        0.8745098  ...
[output truncated]

## Test All Problems

Run this cell to test all your solutions at once:

In [None]:
# Test all problems
if __name__ == "__main__":
    print("=" * 50)
    print("TESTING ALL PROBLEMS")
    print("=" * 50)
    
    try:
        print("\nProblem 1 Results:")
        print(problem1())
    except Exception as e:
        print(f"Problem 1 Error: {e}")
    
    try:
        print("\nProblem 2 Results:")
        print(problem2())
    except Exception as e:
        print(f"Problem 2 Error: {e}")
    
    try:
        print("\nProblem 3 Results:")
        print(problem3())
    except Exception as e:
        print(f"Problem 3 Error: {e}")
    
    try:
        print("\nProblem 4 Results:")
        print(problem4())
    except Exception as e:
        print(f"Problem 4 Error: {e}")
    
    try:
        print("\nProblem 5 Results:")
        print(problem5())
    except Exception as e:
        print(f"Problem 5 Error: {e}")
    
    # Uncomment if attempting bonus
    # try:
    #     print("\nBonus Challenge Results:")
    #     print(bonus_challenge())
    # except Exception as e:
    #     print(f"Bonus Challenge Error: {e}")

## Grading Rubric

| Problem | Points | Criteria |
|---------|--------|----------|
| Problem 1 | 20 | Correct array creation using appropriate NumPy functions |
| Problem 2 | 20 | Proper use of broadcasting and array operations |
| Problem 3 | 25 | Correct indexing, slicing, and boolean indexing |
| Problem 4 | 25 | Accurate statistical calculations using NumPy functions |
| Problem 5 | 10 | Working performance comparison with meaningful results |
| **Total** | **100** | |
| Bonus | +25 | Creative and correct image processing implementation |

### Partial Credit Policy:
- Partially correct solutions will receive proportional credit
- Clear comments explaining approach can earn partial credit even if solution is incomplete

## Resources and Hints

### Useful NumPy Functions:
- **Array creation:** `np.array()`, `np.zeros()`, `np.ones()`, `np.arange()`, `np.linspace()`
- **Array operations:** `np.mean()`, `np.std()`, `np.max()`, `np.min()`
- **Reshaping:** `reshape()`, `flatten()`, `ravel()`, `transpose()`
- **Stacking:** `np.vstack()`, `np.hstack()`, `np.concatenate()`
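
The reshaping and stacking functions listed above can be tried on a small array. This is a minimal sketch with an illustrative `grid` array; the shapes in the comments follow directly from the inputs.

```python
import numpy as np

grid = np.arange(6).reshape(2, 3)      # [[0 1 2], [3 4 5]]

print(grid.ravel().tolist())           # [0, 1, 2, 3, 4, 5]
print(grid.transpose().shape)          # (3, 2)

# vstack joins along rows (axis 0); hstack joins along columns (axis 1).
print(np.vstack([grid, grid]).shape)   # (4, 3)
print(np.hstack([grid, grid]).shape)   # (2, 6)

# np.concatenate does the same with an explicit axis.
print(np.concatenate([grid, grid], axis=1).shape)   # (2, 6)

# flatten() always copies; ravel() returns a view when it can.
print(np.shares_memory(grid, grid.flatten()))   # False
print(np.shares_memory(grid, grid.ravel()))     # True
```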

### Common Mistakes to Avoid:
1. Forgetting to use `copy()` when you need to preserve the original array
2. Not understanding broadcasting rules (compatible dimensions)
3. Using Python loops instead of vectorized NumPy operations
4. Confusing row-major vs column-major ordering
5. Not setting random seed for reproducible results
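
To illustrate the last point, the sketch below shows that re-seeding replays the same random sequence. It also shows `np.random.default_rng`, NumPy's newer generator interface, as an alternative to the global `np.random.seed` used in the setup cell; the assignment itself does not require it.

```python
import numpy as np

# Re-seeding the legacy global generator replays the same sequence.
np.random.seed(1350)
first = np.random.rand(3)
np.random.seed(1350)
second = np.random.rand(3)
print(np.array_equal(first, second))   # True

# np.random.default_rng creates an independent generator object;
# two generators built from the same seed also agree with each other.
rng_a = np.random.default_rng(1350)
rng_b = np.random.default_rng(1350)
print(np.array_equal(rng_a.random(3), rng_b.random(3)))   # True
```

Note that every call advances the generator state, so results only match when the same calls are made in the same order after seeding.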

### Getting Help:
- Review Week 4 lecture notes and lab exercises
- NumPy documentation: https://numpy.org/doc/stable/
- Office hours: See Syllabus

## Academic Integrity Reminder

This is an individual assignment. You may discuss concepts with classmates, but all code must be your own work. Using AI tools for code generation is not permitted and will be considered academic dishonesty. Cite any resources you use beyond course materials.

**Good luck!** Remember: NumPy makes array operations fast and elegant. Think in terms of whole arrays, not individual elements.