# Tutorial 2.5: NumPy Arithmetic Functions
Python for Data Analytics | Module 2  
Professor James Ng

In this tutorial, we will explore a number of ways to perform mathematical calculations on NumPy arrays. But before we get started with that, let's take a bit of an educational diversion.

## A Speed Demonstration
There have been a number of times that I've mentioned that NumPy `ndarray` objects provide massive speed improvements in comparison with standard `list` objects when performing equivalent tasks.

Let's take a moment to demonstrate how much of a difference there can be. I'll use an example here that is very similar to one that you will find in the *Python Data Science Handbook* by Jake VanderPlas.

First we will demonstrate the amount of time it takes to calculate the reciprocal value for every item in a list. Then we will demonstrate the same operation using NumPy and compare the difference.

In [None]:
# Standard Imports
import numpy as np
import pandas as pd
np.random.seed(0)

In [None]:
# Define a function with a single parameter 
# which will be a list. It will return another list
# with the reciprocal values of the original list.
def compute_reciprocals(a_list): 
    
    # Create an empty list called `output` to hold our results
    output = list()
    
    # Operate on each item in the `a_list` parameter,
    for value in a_list:

        # Add a new element to the `output` list that is 
        # the reciprocal value of the original element.
        output.append(1.0 / value)
        
    # Return the new list.
    return output
        
# Call our function with a list containing the numbers 1, 2, 3, 4, 5
small_example = [1, 2, 3, 4, 5]
compute_reciprocals(small_example)

Alright, now let's time how long that operation takes with the `%time` magic command:

In [None]:
%time compute_reciprocals(small_example)

Ok, that was fast. Actually, that was *really* fast. So far, Python is looking pretty good.

But...we were only dealing with a list of 5 elements. No self-respecting data scientist does that! Let's try again with a list of 5 million elements!

In [None]:
# After you submit this, you're gonna have to wait a few seconds for the result

# First, create an `ndarray` of 5 million random integers from 1 to 99
# (the upper bound of 100 passed to `randint` is exclusive).
# Then convert it to a list and send it to our `compute_reciprocals()` function.
big_list = list(np.random.randint(1, 100, size=5000000))
test = %time compute_reciprocals(big_list)

**That took a while.**

It might not seem like much, but there is no way that's gonna work for us when we've got big data sets to evaluate.

The crux of the problem here is that a list can hold items of any type, so for each item Python has to check the data type before it knows how to apply the mathematical operation to it.

## UFuncs to the Rescue

The NumPy package has **UFuncs**, or **Universal Functions**, which can dramatically speed up operations on `ndarray` elements. They are also referred to as **vectorized** operations. You've actually already used some of these when you did array comparisons.

Basically, these functions push loop processing into the C code that lies underneath Python/NumPy so that operations are performed much faster than normal. We don't need to understand the specifics of how this is accomplished - just the knowledge that it *does* happen.

This only works because all the data elements of an array are of the same type.
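Here is a small sketch of that difference. The names `mixed_list`, `int_array`, and `float_array` are made up for illustration; the point is only that a list can mix types while an `ndarray` reports a single `dtype` for everything it holds.

```python
import numpy as np

# A plain list can mix types, so Python must inspect each item individually.
mixed_list = [1, 2.5, "three"]
item_types = [type(item).__name__ for item in mixed_list]
print(item_types)

# An ndarray stores a single dtype for all of its elements.
int_array = np.array([1, 2, 3, 4, 5])
print(int_array.dtype)  # an integer type; the exact width depends on your platform

# Mixing ints and floats still produces one common dtype for the whole array.
float_array = np.array([1, 2.5, 3])
print(float_array.dtype)
```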

### A speed comparison, if you please...
So, as a reminder, our `compute_reciprocals` function simply takes a `list` and then returns another `list` whose values are the reciprocals of the original list values.

Now I will demonstrate the same operation using a NumPy UFunc, which we will just refer to as a **Vectorized NumPy Function**. I'll explain a bit more about what is going on behind the scenes of this syntax below. For now, just focus on the speed difference.

In [None]:
# Create an array of 5 million elements, equivalent to
# the list we created previously
big_array = np.random.randint(1, 100, size=5000000)

# Now time the UFunc approach.
# Remember, the other way took a looooong time.
%time np.divide(1.0, big_array)

*We just went from seconds to milliseconds.* 

**That's an incredible increase in speed.**
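The `%time` magic only works inside IPython or Jupyter. If you want to repeat the comparison in a plain Python script, one possible approach is sketched below using `time.perf_counter`. It assumes a smaller sample of 200,000 values (my choice, to keep the run short), and the actual timings will vary from machine to machine.

```python
import time
import numpy as np

def compute_reciprocals(a_list):
    """Loop-based reciprocal, mirroring the function defined earlier."""
    output = []
    for value in a_list:
        output.append(1.0 / value)
    return output

# A smaller sample than the 5 million used above, to keep the run short.
sample_values = np.random.randint(1, 100, size=200_000)

# Time the pure-Python loop over a list.
start = time.perf_counter()
loop_result = compute_reciprocals(list(sample_values))
loop_seconds = time.perf_counter() - start

# Time the vectorized NumPy function over the array.
start = time.perf_counter()
ufunc_result = np.divide(1.0, sample_values)
ufunc_seconds = time.perf_counter() - start

print(f"loop:  {loop_seconds:.4f} s")
print(f"ufunc: {ufunc_seconds:.4f} s")

# Both approaches should agree on the values themselves.
print(np.allclose(loop_result, ufunc_result))
```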

## Arithmetic Vectorized Functions
As we just demonstrated, there is a vectorized NumPy function for division. It probably will not surprise you then to discover that all the normal Python arithmetic operations are replicated with NumPy equivalents.

Here are some examples:

In [None]:
# Given a `simple_int_array`
simple_int_array = np.array([1, 2, 3, 4, 5])
simple_int_array

In [None]:
# Add 5 to each array element
simple_int_array + 5

In [None]:
# Subtract 10 from each element
simple_int_array - 10

In [None]:
# Subtract each element from 10
10 - simple_int_array

In [None]:
# You can also perform multiple operations.
# Standard order of operations is followed.

# Raise each element to the 3rd power and subtract 10
simple_int_array ** 3 - 10

<div class="alert alert-block alert-info">
<h4>Feeling curious?</h4>
Go ahead and try to do any of these operations with a <code>list</code> object. You won't like the results.
</div> 

## Quick Exercise: 


In [None]:
# From `mylist` below, create a new list in which 10 has been subtracted from each element.
mylist = [11, 12, 13, 14, 15]
    

### So, what is NumPy actually doing here?
Behind the scenes of vectorized functions, NumPy is executing a loop. This is true for all the examples we just demonstrated. 

When NumPy sees `simple_int_array + 5`, it interprets that as, "add 5 to each element of this array and return the results as a new array". NumPy always carries out the indicated operation on ***each element*** of the array object or objects that it is given.

This was true when we used array comparison functions, and it is true here as well.
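To make that concrete, here is a rough sketch comparing the vectorized expression with a hand-written loop that does the same work. The names `vectorized` and `looped` are just for illustration. NumPy's real loop runs in compiled code, so this shows the logic, not the implementation.

```python
import numpy as np

simple_int_array = np.array([1, 2, 3, 4, 5])

# The vectorized form: one expression, and the loop runs inside NumPy.
vectorized = simple_int_array + 5

# A hand-written equivalent: visit each element and fill in a new array.
looped = np.empty_like(simple_int_array)
for i in range(len(simple_int_array)):
    looped[i] = simple_int_array[i] + 5

print(vectorized)
print(looped)
print(np.array_equal(vectorized, looped))
```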

### An Alternative Syntax
In addition to using standard mathematical operators (e.g., `+`, `-`, `*`, `/`), you can also accomplish the same thing by invoking the arithmetic functions by their names.

For example:

In [None]:
# Add 3.5 to each element of our `simple_int_array`
# Notice how the ints are "upcast" to floats?
np.add(3.5, simple_int_array)
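
If you want to confirm the upcasting for yourself, one way is to look at the `dtype` attribute before and after the operation. This is a minimal sketch, and the variable name `added` is only for illustration.

```python
import numpy as np

simple_int_array = np.array([1, 2, 3, 4, 5])

# Integer array plus a float scalar: the result is stored as floats.
added = np.add(3.5, simple_int_array)

print(simple_int_array.dtype.kind)  # 'i' means an integer type
print(added.dtype)
print(added)
```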


In [None]:
# Divide each array element by 4
np.divide(simple_int_array, 4)

In [None]:
# And notice that the order of parameters is important
# when dividing and subtracting...

# This...
print(np.divide(4, simple_int_array))

# Is very different than this...
print(np.divide(simple_int_array, 4))

Here is the summary table of common arithmetic functions available to you:

| Operator      | Equivalent ufunc    | Description                           |
|---------------|---------------------|---------------------------------------|
|``+``          |``np.add``           |Addition (e.g., ``1 + 1 = 2``)         |
|``-``          |``np.subtract``      |Subtraction (e.g., ``3 - 2 = 1``)      |
|``-``          |``np.negative``      |Unary negation (e.g., ``-2``)          |
|``*``          |``np.multiply``      |Multiplication (e.g., ``2 * 3 = 6``)   |
|``/``          |``np.divide``        |Division (e.g., ``3 / 2 = 1.5``)       |
|``//``         |``np.floor_divide``  |Floor division (e.g., ``3 // 2 = 1``)  |
|``**``         |``np.power``         |Exponentiation (e.g., ``2 ** 3 = 8``)  |
|``%``          |``np.mod``           |Modulus/remainder (e.g., ``9 % 4 = 1``)|

## Operations between two NumPy arrays
So far, we've only used one `ndarray` object when using the arithmetic functions. The other value in our operations has always been a "scalar" value. 

For those who are not programming experts, a **scalar** value simply means an object with a single value, like a number. This is opposed to a **container**-type object like a `list` or `ndarray` that holds multiple values.
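If you are ever unsure which kind of object you have, NumPy's `np.isscalar` function offers one quick way to check. A short sketch:

```python
import numpy as np

# Single numbers count as scalars.
print(np.isscalar(5))
print(np.isscalar(3.5))

# Containers that hold multiple values do not.
print(np.isscalar([1, 2, 3]))
print(np.isscalar(np.array([1, 2, 3])))
```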

Now let's go through a few examples of using these functions with two arrays.

In [None]:
# Let's create two new arrays.
# One will have the numbers 1-5 and the other 6-10.
one_to_five = np.arange(1, 6)
six_to_ten = np.arange(6, 11)

print(one_to_five, six_to_ten)

In [None]:
# Now let's add them together.
# Notice how it takes the 1st element of both arrays and adds them together,
# then the second, and so on...
np.add(one_to_five, six_to_ten)

In [None]:
# The same thing will happen with other operations.
# Here we will divide each element of `one_to_five` by `six_to_ten`.
one_to_five / six_to_ten

### Limits

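<p>Being able to perform mathematical operations between two arrays is a really powerful tool. But take note that <em>the element-by-element pairing shown here requires two arrays of the same size and shape</em>. Adding a 5-element array to a 4-element array, for example, raises a <code>ValueError</code>.</p>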
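
To see that limit in action, here is a minimal sketch that tries to add a 5-element array to a 4-element array. The `one_to_four` array is a hypothetical example I've added for this purpose, and the `try`/`except` is only there so the cell finishes cleanly.

```python
import numpy as np

one_to_five = np.arange(1, 6)   # 5 elements
one_to_four = np.arange(1, 5)   # 4 elements (hypothetical example array)

try:
    one_to_five + one_to_four
    outcome = "no error"
except ValueError as error:
    # NumPy cannot pair up the elements of a 5-element and a 4-element array.
    outcome = "ValueError"
    print("Shapes do not match:", error)

print(outcome)
```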