## Exercise 07: NumPy practice

The objective of this exercise is to practice your NumPy skills.

In [64]:
import numpy as np
import math

### Counting zeros

For a 1-d array $x$, we'll define its `number_of_zeros` as the number of elements in the array that are equal to zero.
For example, for the array 

```Python
[1, 5, 0, 6, 0, 1]
```

the `number_of_zeros` is equal to 2.
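
In NumPy, this count can be obtained without an explicit loop by comparing the array to zero and summing the resulting boolean mask. The following is a minimal sketch; the variable names `x`, `mask`, and `count` are illustrative only.

```python
import numpy as np

x = np.array([1, 5, 0, 6, 0, 1])

# Element-wise comparison yields a boolean array: True where x is zero.
mask = (x == 0)

# True counts as 1 when summed, so the sum is the number of zeros.
count = int(mask.sum())
print(count)  # 2
```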

We can apply `number_of_zeros` to a matrix $X$ (i.e. a 2-d array).
The definition can be applied either to the columns or to the rows of the matrix, resulting in an array of `number_of_zeros` values, one for each column or row. Your task is to write a function that computes `number_of_zeros` for a 2-d array. You may not use the NumPy functions `count_nonzero`, `nonzero`, or `argwhere`.

For example, for the matrix
```Python
2 0 3 0
0 0 1 5
0 0 0 6
```

when applied to the columns, the result should be an array that contains the numbers

```Python
2 3 1 1
```

When applied to the rows, the result should be an array that contains the numbers

```Python
2 2 3
```

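Fill in the following function for computing `number_of_zeros`. The `axis` argument should control whether the operation is performed on columns or rows; following the NumPy convention, `axis=0` yields one count per column and `axis=1` one count per row.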

In addition to writing the function, write code that tests its correctness, i.e. code that compares its output to a result you know is correct and returns True/False depending on whether they match.

In [74]:
import numpy as np
import math

In [75]:
def number_of_zeros(X, axis=0):
    """Count zeros in a 2-d matrix: per column if axis=0, per row if axis=1."""
    num_rows = len(X)
    num_cols = len(X[0])

    if axis == 0:
        zeros_count = [0] * num_cols
        for col in range(num_cols):
            for row in range(num_rows):
                if X[row][col] == 0:
                    zeros_count[col] += 1
        return zeros_count

    elif axis == 1:
        zeros_count = [0] * num_rows
        for row in range(num_rows):
            for val in X[row]:
                if val == 0:
                    zeros_count[row] += 1
        return zeros_count

    else:
        raise ValueError("Invalid axis value.")

In [76]:
# test your code here
# your testing should verify that the code works correctly, i.e.
# will return a True/False on whether it matches a result you know
# is correct
matrix1 = [
    [2, 0, 3, 0],
    [0, 0, 1, 5],
    [0, 0, 0, 6]
]
result1 = number_of_zeros(matrix1, axis=0)
correct_result1 = [2, 3, 1, 1]
print("Test case for columns:", result1 == correct_result1)

matrix2 = [
    [2, 0, 3, 0],
    [0, 0, 1, 5],
    [0, 0, 0, 6]
]
result2 = number_of_zeros(matrix2, axis=1)
correct_result2 = [2, 2, 3]
print("Test case for rows:", result2 == correct_result2)

Test case for columns: True
Test case for rows: True
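
The loop-based solution above operates on nested lists. Since the exercise targets NumPy, a vectorized alternative may be worth comparing. The sketch below is one possible approach, not the only one: the name `number_of_zeros_np` is hypothetical, and it assumes the input can be converted to a 2-d array with `np.asarray`. It does not use `count_nonzero`, `nonzero`, or `argwhere`.

```python
import numpy as np

def number_of_zeros_np(X, axis=0):
    """Count zeros along an axis: axis=0 gives one count per column, axis=1 one per row."""
    X = np.asarray(X)
    if axis not in (0, 1):
        raise ValueError("Invalid axis value.")
    # Summing the boolean mask along the axis counts the True entries.
    return (X == 0).sum(axis=axis)

matrix = np.array([
    [2, 0, 3, 0],
    [0, 0, 1, 5],
    [0, 0, 0, 6],
])

# np.array_equal returns a single True/False for the whole comparison.
print(np.array_equal(number_of_zeros_np(matrix, axis=0), [2, 3, 1, 1]))
print(np.array_equal(number_of_zeros_np(matrix, axis=1), [2, 2, 3]))
```

`np.array_equal` is used here because `==` between arrays compares element-wise and returns an array of booleans rather than a single True/False.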


### Removing sparse columns

Write a function that removes sparse columns from a 2-d array.
We define a sparse column as one that contains mostly zeros; more specifically, a column in which at least 90% of the entries are zero. For example, if we apply this to the matrix

```Python
2 0 3 0
0 0 1 5
0 0 0 6
```

the second column would be removed, since all of its entries are zero.
You can use the `number_of_zeros` function you just wrote to help you in this task.
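
To make the 90% threshold concrete, the fraction of zeros in each column of the example matrix can be computed directly. This sketch uses the mean of a boolean mask, which is an illustrative choice rather than a required approach; `zero_fraction` is a hypothetical name.

```python
import numpy as np

X = np.array([
    [2, 0, 3, 0],
    [0, 0, 1, 5],
    [0, 0, 0, 6],
])

# The mean of a boolean mask along axis 0 is the fraction of zeros per column.
zero_fraction = (X == 0).mean(axis=0)
print(zero_fraction)         # approximately [0.667, 1.0, 0.333, 0.333]

# Only the second column reaches the 90% threshold.
is_sparse = zero_fraction >= 0.9
print(is_sparse)
```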

As in the previous problem, you also need to write code to test whether your function works correctly by comparing its output to a case where you know the correct solution.

In [77]:
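def remove_sparse_columns(X):
    """Return a copy of X without columns in which at least 90% of entries are zero."""
    num_rows = len(X)

    zeros_per_column = number_of_zeros(X, axis=0)

    # keep a column only if fewer than 90% of its entries are zero
    keep_columns = []
    for col, num_zeros in enumerate(zeros_per_column):
        if num_zeros / num_rows < 0.9:
            keep_columns.append(col)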

    new_matrix = []
    for row in range(num_rows):
        new_row = []
        for col in keep_columns:
            new_row.append(X[row][col])
        new_matrix.append(new_row)

    return new_matrix

In [78]:
# test your code here
# your testing should verify that the code works correctly, i.e.
# will return a True/False on whether it matches a result you know
# is correct
correct_result1 = [
    [2, 3, 0],
    [0, 1, 5],
    [0, 0, 6]
]

matrix1 = [
    [2, 0, 3, 0],
    [0, 0, 1, 5],
    [0, 0, 0, 6]
]
result1 = remove_sparse_columns(matrix1)

print("Test case 1 with a sparse column:", result1 == correct_result1)

matrix2 = [
    [2, 1, 3, 0],
    [0, 0, 1, 5],
    [0, 0, 0, 6]
]
result2 = remove_sparse_columns(matrix2)
correct_result2 = matrix2
print("Test case 2 without a sparse column:", result2 == correct_result2)

Test case 1 with a sparse column: True
Test case 2 without a sparse column: True
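
With NumPy arrays, the same filtering can be expressed with a boolean mask over the columns. The following is a sketch under a few assumptions: the name `remove_sparse_columns_np` and the `threshold` parameter are hypothetical, and the input is assumed to be convertible to a non-empty 2-d array.

```python
import numpy as np

def remove_sparse_columns_np(X, threshold=0.9):
    """Return X without columns whose fraction of zeros is at least `threshold`."""
    X = np.asarray(X)
    zeros_per_column = (X == 0).sum(axis=0)
    # Boolean mask over columns: True for columns that are not sparse.
    keep = zeros_per_column / X.shape[0] < threshold
    # Indexing the column axis with a boolean mask selects the kept columns.
    return X[:, keep]

matrix = np.array([
    [2, 0, 3, 0],
    [0, 0, 1, 5],
    [0, 0, 0, 6],
])
expected = np.array([
    [2, 3, 0],
    [0, 1, 5],
    [0, 0, 6],
])
print(np.array_equal(remove_sparse_columns_np(matrix), expected))
```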


### Replacing NaNs with zeros

You are given a feature matrix that has some NaN values.  Write a function that creates a new matrix in which all the NaN values are replaced with zeros.
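
One detail worth keeping in mind: NaN compares unequal to every value, including itself, so an equality test cannot be used to locate NaN entries. The short sketch below illustrates this with `math.nan` and `np.isnan`; the variable names are illustrative.

```python
import math
import numpy as np

# NaN is not equal to itself.
nan_equals_itself = (math.nan == math.nan)
print(nan_equals_itself)   # False

x = np.array([1.0, np.nan, 3.0])

# Comparing against NaN never matches, even where the entry is NaN.
eq_mask = (x == np.nan)
print(eq_mask)

# np.isnan is the element-wise test that does find NaN entries.
nan_mask = np.isnan(x)
print(nan_mask)
```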


In [79]:
# your code here
def replace_nans_with_zeros(X):
    """Return a new matrix in which every NaN entry of X is replaced by 0."""
    num_rows = len(X)
    num_cols = len(X[0])

    new_matrix = []

    for row in range(num_rows):
        new_row = []
        for col in range(num_cols):
            if isinstance(X[row][col], float) and math.isnan(X[row][col]):
                new_row.append(0)
            else:
                new_row.append(X[row][col])
        new_matrix.append(new_row)

    return new_matrix

In [80]:
# write code that verifies that there are no NaN values in the matrix
# returned by your function
matrix1 = [
    [2, 1, math.nan, 4],
    [5, math.nan, 7, 8],
    [9, 10, 11, 12]
]
result1 = replace_nans_with_zeros(matrix1)
correct_result1 = [
    [2, 1, 0, 4],
    [5, 0, 7, 8],
    [9, 10, 11, 12]
]
print("Test case 1 with NaN values:", result1 == correct_result1)

# Test case 2: No NaN values to replace
matrix2 = [
    [2, 1, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12]
]
result2 = replace_nans_with_zeros(matrix2)
correct_result2 = matrix2
print("Test case 2 with no NaN values:", result2 == correct_result2)

Test case 1 with NaN values: True
Test case 2 with no NaN values: True
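
For NumPy arrays, the replacement and the "no NaN values remain" check can both be written without loops. This is a sketch of one possible approach, assuming a floating-point array; `replace_nans_with_zeros_np` is a hypothetical name. It uses `np.where`, which builds a new array and leaves the input unchanged.

```python
import numpy as np

def replace_nans_with_zeros_np(X):
    """Return a new float array with NaN entries replaced by 0."""
    X = np.asarray(X, dtype=float)
    # Where the mask is True take 0.0, otherwise take the original value.
    return np.where(np.isnan(X), 0.0, X)

matrix = np.array([
    [2, 1, np.nan, 4],
    [5, np.nan, 7, 8],
    [9, 10, 11, 12],
], dtype=float)

result = replace_nans_with_zeros_np(matrix)

# Verify that no NaN values remain in the returned matrix.
print(not np.isnan(result).any())

# The input is left untouched: it still contains its two NaN entries.
print(int(np.isnan(matrix).sum()))
```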
