## Exercise 09: NumPy practice

The objective of this exercise is to practice your NumPy skills.

In [3]:
import numpy as np

### Counting zeros

For a 1-d array $x$, we'll define its `number_of_zeros` as the number of elements in the array that are equal to zero.
For example, for the array 

```Python
[1, 5, 0, 6, 0, 1]
```

the `number_of_zeros` is equal to 2.
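As a quick check of the definition, the following minimal sketch counts the zeros in the example array by summing a boolean mask. The variable names are illustrative only, and this is just one of several ways to count zeros without `count_nonzero`, `nonzero`, or `argwhere`.

```python
import numpy as np

x = np.array([1, 5, 0, 6, 0, 1])

# x == 0 is a boolean array; each True counts as 1 when summed
zeros_in_x = int(np.sum(x == 0))
print(zeros_in_x)  # 2
```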

We can also apply `number_of_zeros` to a matrix $X$ (i.e., a 2-d array).
The definition can be applied either to the columns or to the rows of the matrix, giving an array with one `number_of_zeros` value per column or per row. Your task is to write a function that computes `number_of_zeros` for a 2-d array. You may not use the NumPy functions `count_nonzero`, `nonzero`, or `argwhere`.

For example, for the matrix
```Python
2 0 3 0
0 0 1 5
0 0 0 6
```

when applied to the columns, the result should be an array that contains the numbers

```Python
2 3 1 1
```

when applied to the rows, the result should be an array that contains the numbers

```Python
2 2 3
```

Fill in the following function for computing `number_of_zeros`. The `axis` argument should control whether the operation is performed on columns (`axis=0`) or rows (`axis=1`).

In addition to writing the function, write code that tests its correctness, i.e. compares its output to a result you know is correct, returning True/False on whether it matches that correct output.

In [4]:
def number_of_zeros(X, axis=0):
    # X == 0 is a boolean mask; summing it counts the True entries along `axis`
    return np.sum(X == 0, axis=axis)

In [5]:
# test your code here
# your testing should verify that the code works correctly, i.e.
# will return a True/False on whether it matches a result you know
# is correct
X = np.array([[2, 0, 3, 0],
              [0, 0, 1, 5],
              [0, 0, 0, 6]])
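cols_expected = np.array([2, 3, 1, 1])  # zeros per column
rows_expected = np.array([2, 2, 3])     # zeros per row
print(np.array_equal(number_of_zeros(X, axis=0), cols_expected))
print(np.array_equal(number_of_zeros(X, axis=1), rows_expected))

True
True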
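Beyond a single hand-checked matrix, one possible way to gain confidence is to compare the vectorized function against a slow but obviously correct loop on random input. The sketch below assumes this approach; `number_of_zeros_loop`, the seed, and the matrix size are hypothetical choices made for illustration, and `number_of_zeros` is repeated so the block runs on its own.

```python
import numpy as np

def number_of_zeros(X, axis=0):
    # vectorized version, repeated here so the block is self-contained
    return np.sum(X == 0, axis=axis)

def number_of_zeros_loop(X, axis=0):
    # hypothetical reference: count zeros one column (or row) at a time
    n_rows, n_cols = X.shape
    if axis == 0:
        return np.array([sum(1 for i in range(n_rows) if X[i, j] == 0)
                         for j in range(n_cols)])
    return np.array([sum(1 for j in range(n_cols) if X[i, j] == 0)
                     for i in range(n_rows)])

rng = np.random.default_rng(0)            # fixed seed, chosen arbitrarily
X_rand = rng.integers(0, 3, size=(6, 4))  # entries drawn from {0, 1, 2}

cols_match = np.array_equal(number_of_zeros(X_rand, axis=0),
                            number_of_zeros_loop(X_rand, axis=0))
rows_match = np.array_equal(number_of_zeros(X_rand, axis=1),
                            number_of_zeros_loop(X_rand, axis=1))
print(cols_match, rows_match)
```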


### Removing sparse columns

Write a function that removes sparse columns from a 2-d array.
We define a sparse column as one that contains mostly zeros; more specifically, a column is sparse if at least 90% of its entries are zero. For example, if we apply this to the matrix

```Python
2 0 3 0
0 0 1 5
0 0 0 6
```

the second column would be removed, because all three of its entries (100%) are zero.
You can use the `number_of_zeros` function you just wrote to help you in this task.
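To make the 90% rule concrete, here is a small sketch that computes the fraction of zeros in each column of the example matrix and flags the sparse ones. It uses the mean of a boolean mask, which is one possible way to get the fraction; the names `zero_fraction` and `is_sparse` are illustrative.

```python
import numpy as np

X = np.array([[2, 0, 3, 0],
              [0, 0, 1, 5],
              [0, 0, 0, 6]])

# fraction of zero entries per column: the mean of the boolean mask
zero_fraction = np.mean(X == 0, axis=0)

# a column is sparse when at least 90% of its entries are zero
is_sparse = zero_fraction >= 0.9
print(zero_fraction)
print(is_sparse)
```

The fractions are 2/3, 1, 1/3, and 1/3, so only the second column reaches the threshold.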

As in the previous problem, you also need to write code to test whether your function works correctly by comparing its output to a case where you know the correct solution.

In [8]:
def remove_sparse_columns(X):
    rows, cols = X.shape
    num_zeros = number_of_zeros(X, axis=0)
    zero_ratio = num_zeros / rows
    cols_to_keep = zero_ratio < 0.9  # sparse means at least 90% zeros, so keep only ratios below 0.9
    return X[:, cols_to_keep]

In [9]:
# test your code here
# your testing should verify that the code works correctly, i.e.
# will return a True/False on whether it matches a result you know
# is correct
X = np.array([[2, 0, 3, 0],
              [0, 0, 1, 5],
              [0, 0, 0, 6]])
expected = np.array([[2, 3, 0],
                     [0, 1, 5],
                     [0, 0, 6]])
print(np.array_equal(remove_sparse_columns(X), expected))

True
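The example matrix never hits the threshold exactly, so it cannot tell "at least 90%" apart from "more than 90%". The sketch below probes that boundary with a column that has exactly 9 zeros out of 10 entries. The helper `drop_sparse_columns` and its `threshold` parameter are hypothetical, written here only so the block is self-contained.

```python
import numpy as np

def drop_sparse_columns(X, threshold=0.9):
    # hypothetical helper: a column is sparse when its zero fraction is >= threshold
    zero_fraction = np.mean(X == 0, axis=0)
    return X[:, zero_fraction < threshold]

# 10 rows: column 0 has 9 zeros (exactly 90%), column 1 has 8 zeros (80%)
X_edge = np.zeros((10, 2), dtype=int)
X_edge[0, 0] = 7
X_edge[0, 1] = 7
X_edge[1, 1] = 7

result = drop_sparse_columns(X_edge)
expected = X_edge[:, [1]]  # only the 80%-zeros column should survive
print(np.array_equal(result, expected))
```

Under the definition given above, the column with exactly 90% zeros counts as sparse and is removed, so only the second column remains.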

### Replacing NaNs with zeros

You are given a feature matrix that has some NaN values.  Write a function that creates a new matrix in which all the NaN values are replaced with zeros.


In [10]:
# your code here
def replace_nan_with_zero(X):
    # np.where builds a new array, so the input X is left unchanged
    return np.where(np.isnan(X), 0, X)

In [11]:
# write code that verifies that there are no NaN values in the matrix
# returned by your function
X = np.array([[2, np.nan, 3, np.nan],
              [np.nan, np.nan, 1, 5],
              [np.nan, np.nan, np.nan, 6]])
result = replace_nan_with_zero(X)
print(not np.isnan(result).any())
result

True
array([[2., 0., 3., 0.],
       [0., 0., 1., 5.],
       [0., 0., 0., 6.]])
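Since the task asks for a new matrix, it can also be worth checking that the input still contains its NaN values afterwards. The sketch below shows two ways of building the cleaned copy and verifies both properties; the variable names are illustrative, and `np.nan_to_num` is shown only as an alternative, with the caveat that by default it also replaces infinities with large finite values.

```python
import numpy as np

X_nan = np.array([[2, np.nan, 3],
                  [np.nan, 1, 5]])

# np.where returns a new array and only changes the NaN entries
cleaned = np.where(np.isnan(X_nan), 0.0, X_nan)

# np.nan_to_num also returns a copy by default, but it additionally
# replaces +/-inf with large finite values
cleaned_alt = np.nan_to_num(X_nan, nan=0.0)

print(np.isnan(cleaned).any())               # False: no NaN values remain
print(np.isnan(X_nan).any())                 # True: the input was not modified
print(np.array_equal(cleaned, cleaned_alt))  # True: both approaches agree here
```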