## NumPy Vectorization and Broadcasting

NumPy, short for Numerical Python, is one of the core Python packages used in data science and machine learning. Its key data structure is the array, called a NumPy array, which resembles a Python list. The package provides functions for creating arrays of different sizes, shapes, and types, and for performing operations on array elements.

In this exercise you will become familiar with NumPy arrays and key NumPy techniques called vectorization and broadcasting.

<b>Broadcasting</b>: Arrays of different shapes cannot ordinarily be added, subtracted, or otherwise combined element by element in arithmetic. One way around this is to stretch the smaller array so that it matches the dimensionality and size of the larger array. NumPy does this automatically during array arithmetic when the shapes are compatible, and it does so without actually copying the data. This is called array broadcasting, and it can greatly shorten and simplify your code.
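As a minimal sketch of broadcasting, the example below adds a one-dimensional array and a scalar to a two-dimensional array. The names `matrix` and `row` are hypothetical and are not part of the exercise.

```python
import numpy as np

# Hypothetical example arrays (not part of the exercise).
matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])   # shape (2, 3)
row = np.array([10, 20, 30])     # shape (3,)

# The row is broadcast across both rows of the matrix.
result = matrix + row
print(result)
# [[11 22 33]
#  [14 25 36]]

# A scalar is broadcast to every element.
doubled = matrix * 2
print(doubled)
# [[ 2  4  6]
#  [ 8 10 12]]
```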

<b>Vectorization</b>: 
NumPy provides functions and operators that apply a mathematical operation to an entire array at once, work that would otherwise require a `for` loop. This is referred to as vectorization. Because the looping happens inside NumPy's compiled code rather than in the Python interpreter, a vectorized operation is usually much faster. If a NumPy function fits your needs, prefer it over an explicit loop.
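To illustrate, the sketch below computes square roots twice: once with an explicit Python loop and once with the vectorized function `np.sqrt`. The array `values` is hypothetical sample data.

```python
import numpy as np

values = np.array([1.0, 4.0, 9.0, 16.0])  # hypothetical sample data

# Loop version: compute each square root one element at a time.
roots_loop = np.array([v ** 0.5 for v in values])

# Vectorized version: a single call operates on the whole array.
roots_vectorized = np.sqrt(values)

# Both approaches give the same values.
print(np.allclose(roots_loop, roots_vectorized))  # True
```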

For more information about the NumPy package and NumPy arrays, consult the NumPy online [API Reference](https://numpy.org/doc/stable/reference/index.html).


## Step 1 


Let’s start by importing the `numpy` package using its conventional shorthand name, `np`. "Conventional" means that the community has agreed on this name; nothing stops you from choosing your own shorthand name, or from not using one at all. In this notebook, we will import the `numpy` package as follows: `import numpy as np`.

Run the code cell below.

<b>Reminder</b>: Prolonged inactivity or navigating to a different page will cause the notebook session to time out. To resume your work where you left off, re-run all of the notebook cells in order, starting with the code cell below.



In [None]:
import numpy as np

## Step 2

Using the `numpy` function `np.arange()`, define an array of the integers from 0 to 999 (inclusive) and assign it to the variable `a`. Create a new variable named `b` and assign it the value 10.

Note: Consult the numpy `arange()` function [documentation](https://numpy.org/doc/stable/reference/generated/numpy.arange.html) to see how to use this function.
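As a quick sketch of how `np.arange()` treats its arguments, note that the stop value is excluded from the result. The variable names below are illustrative only.

```python
import numpy as np

# With a single argument, arange counts from 0 up to (not including) the stop value.
first_five = np.arange(5)
print(first_five)            # [0 1 2 3 4]

# With start, stop, and step: even numbers from 2 up to (not including) 10.
evens = np.arange(2, 10, 2)
print(evens)                 # [2 4 6 8]
```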

### Graded Cell

The cell below will be graded. Remove the line "raise NotImplementedError()" before writing your code.

In [None]:
a = np.arange(0,1000)
b = 10

### Self-Check

Run the cell below to test the correctness of your code above before submitting for grading. Do not add code or delete code in the cell.

In [None]:
# Run this self-test cell to check your code; 
# do not add code or delete code in this cell
from jn import testAB

try:
    p, err = testAB(a,b)
    print(err)
except Exception as e:
    print("Error!\n" + str(e))
    


## Step 3

You will write a function that adds a numerical constant value to every element of a numpy array.

Complete the function called `array_add_constant` that does the following:

1. Takes in two arguments: (1) a `numpy` array of any size, (2) a numeric constant
2. Defines an empty Python list (not a `numpy` array) named `new_list`.
3. Uses a for loop to loop over the elements of the input `numpy` array and does the following for each element:
    * adds the numeric constant to the element
    * appends the result to the Python list `new_list`.
4. Converts `new_list` to a `numpy` array using the `np.array()` function. Consult the `array()` function [documentation](https://numpy.org/doc/stable/reference/generated/numpy.array.html) to see how to use this function.
5. Returns the `numpy` array.



### Graded Cell

The cell below will be graded. Remove the line "raise NotImplementedError()" before writing your code.

In [None]:
def array_add_constant(numpy_array, constant):
    new_list = []
    for element in numpy_array:
        # Iterate over the values themselves; do not use a value as an index
        new_list.append(element + constant)
    return np.array(new_list)
    


### Self-Check

Run the cell below to test the correctness of your code above before submitting for grading. Do not add code or delete code in the cell.

In [None]:
# Run this self-test cell to check your code; do not add code or delete code in this cell
from jn import testFunction

try:
    p, err = testFunction(array_add_constant)
    print(err)
except Exception as e:
    print("Error!\n" + str(e))
    

## Step 4

Call your function `array_add_constant` using the arguments you created in Step 2: `numpy` array `a` and constant `b`. Store the result in a new variable called `ab_loop`.

### Graded Cell

The cell below will be graded. Remove the line "raise NotImplementedError()" before writing your code.

In [None]:
ab_loop = array_add_constant(a, b)

### Self-Check

Run the cell below to test the correctness of your code above before submitting for grading. Do not add code or delete code in the cell.

In [None]:
# Run this self-test cell to check your code; do not add code or delete code in this cell
from jn import testABLoop

try:
    p, err = testABLoop(array_add_constant,a,b,ab_loop)
    print(err)
except Exception as e:
    print("Error!\n" + str(e))
    

## Step 5

Note how `numpy` array `ab_loop` was created. 

The function `array_add_constant` used several steps, including a `for` loop, to perform a mathematical operation (addition) on each element of array `a` in order to produce array `ab_loop`.

You will now do the same thing using the NumPy way! Create a new variable called `ab_numpy`. Assign it the value of `a + b`. 



### Graded Cell

The cell below will be graded. Remove the line "raise NotImplementedError()" before writing your code.

In [None]:
ab_numpy = a + b

### Self-Check

Run the cell below to test the correctness of your code above before submitting for grading. Do not add code or delete code in the cell.

In [None]:
# Run this self-test cell to check your code; do not add code or delete code in this cell
from jn import testABNumpy

try:
    p, err = testABNumpy(a,b,ab_numpy)
    print(err)
except Exception as e:
    print("Error!\n" + str(e))
    

To further demonstrate that both techniques produce the same results, the cell below contains code to compare `ab_numpy` and `ab_loop` to check whether they are equivalent.

1. The cell contains a boolean expression that checks whether `ab_numpy` differs from `ab_loop`. This expression compares the two arrays element by element and returns an array that contains a boolean value for every position: `True` where the elements differ and `False` where they match.

2. The cell also contains the same expression with an added `sum()` method that counts how many comparisons are `True`, that is, how many elements differ. A result of 0 means the two arrays are identical.

Run the cell below and inspect the results.


In [None]:
boolean_array = (ab_numpy != ab_loop)
print(boolean_array)

sum_of_values = (ab_numpy != ab_loop).sum()
print(sum_of_values)
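
As an alternative to counting mismatches, `np.array_equal()` returns a single boolean that is `True` only when two arrays have the same shape and the same elements. The sketch below uses hypothetical stand-ins `x` and `y` rather than the notebook's `ab_numpy` and `ab_loop`.

```python
import numpy as np

# Hypothetical stand-ins for ab_numpy and ab_loop.
x = np.arange(5) + 10
y = np.array([10, 11, 12, 13, 14])

# One boolean for the whole comparison.
print(np.array_equal(x, y))   # True

# Equivalent check by counting mismatched elements.
mismatches = (x != y).sum()
print(mismatches)             # 0
```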

## Step 6

Your statement `ab_numpy = a + b` is an example of both <b>vectorization</b> and <b>broadcasting</b>.

You performed a mathematical operation that involved an array and one numerical value; you added a value to each element of an array using a simple arithmetic statement `a+b`. This is an example of <b>broadcasting</b>.

This also employed the technique of <b>vectorization</b>; the statement took the place of a loop.

To demonstrate how <b>vectorization</b> speeds up performance, let's compare the two methods from a time performance perspective.

The code cell below uses the `%timeit` magic function to see how long function `array_add_constant(a,b)` takes. Run the cell and inspect the results.

In [None]:
%timeit array_add_constant(a,b)

## Step 7

Now let's run the same test, but using the technique from Step 5, `a + b`. Run the cell below and inspect the results.

In [None]:
%timeit a+b

Compare the time it takes to add a constant to a `numpy` array using a loop with the time it takes using NumPy techniques. What speed difference do you observe? The vectorized version is usually faster by an order of magnitude or more, which is why you should prefer NumPy techniques whenever possible.
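The `%timeit` magic is only available in IPython and Jupyter. Outside a notebook, a similar comparison can be sketched with the standard-library `timeit` module. The run count of 1000 is an arbitrary choice, the function is redefined here so that the snippet is self-contained, and the measured times will vary from machine to machine.

```python
import timeit

import numpy as np

a = np.arange(0, 1000)
b = 10

def array_add_constant(numpy_array, constant):
    """Add a constant to every element using an explicit Python loop."""
    new_list = []
    for element in numpy_array:
        new_list.append(element + constant)
    return np.array(new_list)

# Time 1000 runs of each approach (the run count is an arbitrary choice).
loop_time = timeit.timeit(lambda: array_add_constant(a, b), number=1000)
numpy_time = timeit.timeit(lambda: a + b, number=1000)

print(f"loop:  {loop_time:.4f} s for 1000 runs")
print(f"numpy: {numpy_time:.4f} s for 1000 runs")
print(f"speedup: roughly {loop_time / numpy_time:.0f}x")
```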