
----

# NumPy Arrays



---

### Table of Contents


1 - [Creating NumPy Arrays](#section1)<br>

2 - [Basic Operations with NumPy Arrays](#section2)<br>

3 - [Indexing & Slicing NumPy Arrays](#section3)<br>

---

NumPy (short for "Numerical Python") is a Python library that allows us to work with arrays and easily process large amounts of numerical data. Because NumPy is a separate library rather than part of core Python, we must import it before we can use it!

In order to save us some typing time we can give our libraries a shorter alias, like <b>np</b> for NumPy.

In [2]:
import numpy as np
from matplotlib import pyplot as plt
%matplotlib inline

## Creating NumPy Arrays  <a id='section1'></a>


A NumPy array is just a table of data of the same type. From [NumPy.org](https://numpy.org/doc/stable/user/absolute_beginners.html): NumPy arrays can be used to perform a wide variety of mathematical operations on arrays. It adds powerful data structures to Python that guarantee efficient calculations with arrays and matrices and it supplies an enormous library of high-level mathematical functions that operate on these arrays and matrices.


In order to use NumPy you can either convert data you already have into a NumPy array or create a blank array from scratch. NumPy arrays and Python lists are similar, yet they react differently to various operations.

In [3]:
#Create an array
array1 = np.array([1,2,3,4])

#Create a list
list1 = [1,2,3,4]

array1, list1

(array([1, 2, 3, 4]), [1, 2, 3, 4])

Try doing the following operations and see if you notice any differences!

array1 + array1

list1 + list1

array1 * array1

list1 * list1

In [11]:
# EXERCISE

print(array1 + array1) # add two arrays
print(list1 + list1) # add two lists

[2 4 6 8]
[1, 2, 3, 4, 1, 2, 3, 4]


In [12]:
# EXERCISE

# multiply two arrays
# multiply two lists

print(array1 * array1)
print(list1 * list1)

[ 1  4  9 16]


TypeError: can't multiply sequence by non-int of type 'list'
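
The `TypeError` above appears because `*` on a list means "repeat this list an integer number of times", not element-wise multiplication. As a minimal sketch (the names `repeated`, `squares_list`, and `squares_array` are illustrative and not part of the original notebook), an element-wise product of two lists needs an explicit comprehension, while the array version is a single expression:

```python
import numpy as np

list1 = [1, 2, 3, 4]
array1 = np.array(list1)

# For lists, * only accepts an integer and repeats the list
repeated = list1 * 2

# Element-wise multiplication of two lists needs an explicit comprehension
squares_list = [a * b for a, b in zip(list1, list1)]

# For arrays, * is already element-wise
squares_array = array1 * array1

print(repeated)       # [1, 2, 3, 4, 1, 2, 3, 4]
print(squares_list)   # [1, 4, 9, 16]
print(squares_array)
```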

In [13]:
# EXAMPLE

# Create a new list

list_of_numbers = [0, 1, 2, 3, 4, 5]
list_of_numbers

[0, 1, 2, 3, 4, 5]

In [14]:
# EXAMPLE

# Verify that it is indeed a regular Python list

type(list_of_numbers)

list

In [15]:
# EXAMPLE

# Create a new NumPy array from our list and display it

array_from_list = np.array(list_of_numbers)
array_from_list

array([0, 1, 2, 3, 4, 5])

In [16]:
# EXERCISE

# Verify that it's an array and not a list

type(array_from_list)

numpy.ndarray

Usually, we get arrays from our data. If we don't yet have any data or just want a placeholder for our data, we can create a NumPy array filled with ones by calling `np.ones()` and specifying the shape of the array as a tuple, for example `(3, 4)` for 3 rows and 4 columns.

*A tuple is another data type in Python. It's similar to a list, but uses parentheses instead of square brackets, and unlike lists, tuples are unchangeable (immutable).*
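
To make the "unchangeable" part concrete, here is a small sketch. The variable names `shape` and `placeholder` are only illustrative. It shows that assigning to a tuple element fails, and that a NumPy array reports its own shape as a tuple as well:

```python
import numpy as np

shape = (3, 4)  # a tuple: 3 rows, 4 columns

try:
    shape[0] = 5  # tuples cannot be modified in place
except TypeError as err:
    print("Cannot change a tuple:", err)

placeholder = np.ones(shape)
print(placeholder.shape)  # (3, 4)
```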

In [17]:
# EXAMPLE

#Create an array of size 3x4 and fill it with ones

ones_array = np.ones((3, 4))
ones_array

array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

We can also create arrays filled with zeros.

In [18]:
# EXERCISE

# Create a 3x4 array of zeroes
# It will have a similar syntax, but instead we need to call 
# a function "zeros", not "ones" from our NumPy library

zeros_array = np.zeros((3, 4))
zeros_array

array([[0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]])

In [19]:
import numpy as np

In [20]:
x = np.array([[1,2,3],[4,5,6],[7,8,9]])
print(x)

[[1 2 3]
 [4 5 6]
 [7 8 9]]
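
The cell above builds a two-dimensional array from a list of lists, where each inner list becomes one row. A quick way to confirm this, sketched here with the standard `ndim` and `shape` attributes, is to ask the array for its number of dimensions and its shape:

```python
import numpy as np

x = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

print(x.ndim)   # 2, the number of dimensions
print(x.shape)  # (3, 3), meaning 3 rows and 3 columns
```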


Similarly we can create NumPy arrays filled with any other value. To accomplish this we use `np.full` and specify both the size of the array and the value we would like to fill that array with.

In [21]:
# EXAMPLE

array_of_twos = np.full((3, 4), 2)
array_of_twos

array([[2, 2, 2, 2],
       [2, 2, 2, 2],
       [2, 2, 2, 2]])

In [22]:
# EXERCISE

# Create a 2x5 array of halves: "0.5" or "1/2"

array_of_halves = np.full((2, 5), 0.5)
array_of_halves

array([[0.5, 0.5, 0.5, 0.5, 0.5],
       [0.5, 0.5, 0.5, 0.5, 0.5]])
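
Notice that the array of twos prints without decimal points while the array of halves prints with them. As a hedged sketch of why: when no `dtype` is given, `np.full` picks the data type from the fill value, so an integer fill value gives an integer array. The name `float_twos` below is illustrative only:

```python
import numpy as np

array_of_twos = np.full((3, 4), 2)       # integer fill value
array_of_halves = np.full((2, 5), 0.5)   # float fill value

print(array_of_twos.dtype)    # an integer type (exact width depends on the platform)
print(array_of_halves.dtype)  # float64

# Passing dtype explicitly forces floating-point twos
float_twos = np.full((3, 4), 2, dtype=float)
print(float_twos.dtype)       # float64
```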

In [23]:
# EXAMPLE

x = np.array([1, 2, 3])
y = np.array([4, 5, 6])
# add array x to array y and store it in a result array
result = np.add(x,y)
result

array([5, 7, 9])

In [24]:
# EXERCISE

# Modify the above code to subtract array x from array y

x = np.array([1, 2, 3])
y = np.array([4, 5, 6])

result = np.subtract(y,x)
result

array([3, 3, 3])

In the cell below, create a 4x3 array filled with Pi.

**Hint:** Similar to the math library, you can use the NumPy library to get the value of pi.

In [25]:
# EXERCISE

array_of_pies = np.full((4,3), np.pi)
array_of_pies

array([[3.14159265, 3.14159265, 3.14159265],
       [3.14159265, 3.14159265, 3.14159265],
       [3.14159265, 3.14159265, 3.14159265],
       [3.14159265, 3.14159265, 3.14159265]])

If we want NumPy to fill the array with random numbers between zero and one, we can use `np.random.rand()`.
Although it may seem counterintuitive, this function does not take the size of the array as a tuple. Instead we pass the dimensions of the array directly as separate arguments.

In [26]:
# EXAMPLE

np.random.rand(2,3)

array([[0.21927748, 0.07852753, 0.15355667],
       [0.86288264, 0.01347372, 0.8528799 ]])
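
For comparison, here is a sketch of two other NumPy calls that do take the shape as a tuple, `np.random.random` and the `random` method of a generator created with `np.random.default_rng`. These are alternatives that the original notebook does not use, and the seed value is an arbitrary choice for the illustration:

```python
import numpy as np

a = np.random.rand(2, 3)        # dimensions as separate arguments
b = np.random.random((2, 3))    # shape as a tuple

rng = np.random.default_rng(seed=0)  # seeded generator (seed chosen arbitrarily)
c = rng.random((2, 3))               # shape as a tuple

print(a.shape, b.shape, c.shape)
```

All three produce a 2x3 array of values that are at least zero and strictly less than one.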

## Basic Operations with NumPy Arrays  <a id='section2'></a>


NumPy is much more than just an array creator. It allows us to do blazingly fast operations with arrays. An operation on a NumPy array is applied to every element at once in compiled code, so it typically runs significantly faster than an equivalent Python loop over a list.
For example, let's say that we have an array of one million random probabilities of it raining on a particular day.

In [27]:
# EXAMPLE

random_million = np.random.rand(1000000, 1)
random_million

array([[0.00849633],
       [0.88625951],
       [0.607407  ],
       ...,
       [0.47147279],
       [0.10976026],
       [0.87156168]])

Let's check that the array is actually one million numbers long.

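**Hint:** Just like with most other data structures (lists) and some data types (strings), you can use the traditional Python `len()` function to check the *length* of your object. For an array with more than one dimension, `len()` returns the length of the first dimension (the number of rows).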

In [28]:
# EXERCISE
len(random_million)

1000000
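
Because this array has two dimensions (one million rows, one column), `len()` only reports the number of rows. A short sketch of the related `shape` and `size` attributes follows. The second array, `wide`, is a made-up example where the difference is visible:

```python
import numpy as np

random_million = np.random.rand(1000000, 1)

print(len(random_million))    # 1000000, the number of rows
print(random_million.shape)   # (1000000, 1), rows and columns
print(random_million.size)    # 1000000, the total number of elements

wide = np.random.rand(2, 5)
print(len(wide), wide.size)   # 2 10
```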

Yep, that checks out.<br> <br>Now let's say that we want these probabilities to be percentages (out of 100) rather than proportions (from zero to one). We can just multiply the whole array by 100!

In [30]:
%%time
# EXAMPLE

percentages = random_million * 100
percentages

CPU times: total: 0 ns
Wall time: 1.99 ms


array([[ 0.84963305],
       [88.62595074],
       [60.7406997 ],
       ...,
       [47.14727852],
       [10.9760262 ],
       [87.15616758]])

Notice that NumPy accomplishes that multiplication in a fraction of a second when we use it with arrays. That's a million multiplications! In the cell below you can see how much longer the code is for doing the same operations on a list. 

**Note:** We cut our list down to only 100,000 values, because it can take a long time to run a for-loop over 1 million values.

In [34]:
%%time

# EXAMPLE

random_mln_lst = list(random_million[:100000])
percentages_lst = [] 

for i in random_mln_lst:
    percentages_lst += [i*100] 
    
# print(percentages_lst)    

CPU times: total: 78.1 ms
Wall time: 167 ms
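
Outside of a notebook, where the `%%time` magic is not available, a similar comparison can be sketched with `time.perf_counter` from the standard library. This is only an illustration: the variable names are made up, the timings will differ from machine to machine, and `tolist()` is used so that the loop works on plain Python floats:

```python
import time

import numpy as np

values = np.random.rand(100_000)

# Vectorized: one expression multiplies every element
start = time.perf_counter()
vectorized = values * 100
numpy_seconds = time.perf_counter() - start

# Loop: multiply the same values one at a time in Python
values_list = values.tolist()
start = time.perf_counter()
looped = [v * 100 for v in values_list]
loop_seconds = time.perf_counter() - start

print(f"NumPy: {numpy_seconds:.6f} s, loop: {loop_seconds:.6f} s")
print(np.allclose(vectorized, looped))  # both approaches give the same numbers
```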


Let's see if we can get NumPy to at least break a sweat doing multiplications.

In [35]:
%%time
# EXAMPLE

# 100 million multiplications
np.random.rand(100000000) * 100

CPU times: total: 734 ms
Wall time: 1.11 s


array([43.33850829, 36.95096887, 64.46785933, ..., 69.19833892,
       85.31574162, 53.02085393])

NumPy is really fast! Keep in mind that it first has to generate the random numbers in the array, and only then can it do the multiplications we are asking for. That's pretty useful if you want to analyze a huge amount of data!

Let's see what other operations we can do with NumPy arrays.

Can it add the same number to all the elements of our array? How about subtracting it? Even dividing by it?

In [36]:
# EXAMPLE

plus_fifty = random_million + 50
plus_fifty

array([[50.00849633],
       [50.88625951],
       [50.607407  ],
       ...,
       [50.47147279],
       [50.10976026],
       [50.87156168]])

How about if we want to divide each value by 2?

In [37]:
# EXERCISE

divided_by_two = random_million / 2
divided_by_two

array([[0.00424817],
       [0.44312975],
       [0.3037035 ],
       ...,
       [0.23573639],
       [0.05488013],
       [0.43578084]])

We can even do those arithmetic operations between two arrays if they have the same shape!

In [38]:
# EXAMPLE

sum_of_arrays = divided_by_two + plus_fifty
sum_of_arrays

array([[50.0127445 ],
       [51.32938926],
       [50.9111105 ],
       ...,
       [50.70720918],
       [50.16464039],
       [51.30734251]])
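
To see what happens when the arrays do not line up, here is a small sketch with made-up arrays `a`, `b`, and `c`. Adding two arrays of three elements works element by element, while adding a three-element array to a two-element array raises a `ValueError`:

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([10, 20, 30])
c = np.array([1, 2])

total = a + b
print(total)  # [11 22 33]

mismatch_failed = False
try:
    a + c  # 3 elements plus 2 elements cannot be paired up
except ValueError as err:
    mismatch_failed = True
    print("Shapes do not match:", err)
```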

In [39]:
# EXAMPLE

x = np.array([2, 3, 4])
np.exp(x)

array([ 7.3890561 , 20.08553692, 54.59815003])

In [40]:
# EXERCISE

#Modify the above code to calculate the square root of x.
x = np.array([2, 3, 4])
np.sqrt(x)

array([1.41421356, 1.73205081, 2.        ])
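
Functions such as `np.exp` and `np.sqrt` are applied to each element of the array separately. As a rough sketch of what that means, the result of `np.sqrt(x)` matches what a loop with `math.sqrt` from the standard library would produce. The names `roots_numpy` and `roots_loop` are illustrative:

```python
import math

import numpy as np

x = np.array([2, 3, 4])

roots_numpy = np.sqrt(x)                 # one call for the whole array
roots_loop = [math.sqrt(v) for v in x]   # one call per element

print(np.allclose(roots_numpy, roots_loop))  # True
```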

## Indexing & Slicing NumPy Arrays  <a id='section3'></a>


Just like we did with Python lists, if we ever need to retrieve a value at a particular index in a NumPy array, we can put that index in square brackets after the array name, as in `array[index]`.

In [41]:
# EXAMPLE

print(array_from_list)
print("Value at index 0 is:", array_from_list[0])

[0 1 2 3 4 5]
Value at index 0 is: 0


Try to retrieve a value at index **3** of our array.

In [42]:
# EXERCISE

print("Value at index 3 is:", array_from_list[3]) 

Value at index 3 is: 3
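
As with lists, negative indices count from the end of the array. A minimal sketch using the same values as `array_from_list`:

```python
import numpy as np

array_from_list = np.array([0, 1, 2, 3, 4, 5])

last = array_from_list[-1]         # last element
second_last = array_from_list[-2]  # second to last element

print(last)         # 5
print(second_last)  # 4

# Equivalent to indexing with len(...) - 1
print(array_from_list[len(array_from_list) - 1])  # 5
```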


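We can also get a "slice" of numbers just like we would from a list. Remember that a slice `[start:stop]` includes the value at the start index but stops one position before the stop index, that is, at index stop - 1!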

In [43]:
# EXAMPLE

array_from_list[2:5]

array([2, 3, 4])

Now how would you return all the values starting with index 2 (skipping values at indices 0 and 1)?

In [44]:
# EXERCISE

array_from_list[2:]

array([2, 3, 4, 5])
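
The slicing patterns used so far can be summarized in one sketch. The optional third number (the step) is an addition not shown in the cells above. It works the same way as it does for Python lists:

```python
import numpy as np

array_from_list = np.array([0, 1, 2, 3, 4, 5])

first_two = array_from_list[:2]     # start omitted: begin at index 0
middle = array_from_list[2:5]       # indices 2, 3 and 4 (stop index 5 is excluded)
every_other = array_from_list[::2]  # step of 2 over the whole array

print(first_two)    # [0 1]
print(middle)       # [2 3 4]
print(every_other)  # [0 2 4]
```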

You can think of arrays as tables. If your array has more than one column per row, we just use a comma between the index of the first dimension (row) and the index of the second dimension (column). The indexing and slicing works exactly the same as before, but we can do it separately for rows and columns.

In [45]:
# EXAMPLE

# Note: despite the variable name, this array has 5 rows and 3 columns
three_by_five = np.random.rand(5, 3)
three_by_five

array([[0.6983186 , 0.71496761, 0.66618526],
       [0.10178713, 0.44811344, 0.46431533],
       [0.00098478, 0.76585604, 0.85140256],
       [0.15852084, 0.90010877, 0.84401433],
       [0.74698854, 0.90041702, 0.47806698]])

In [46]:
# EXAMPLE

three_by_five[0, 0]

0.6983186043844742

Now let's try to retrieve the value from the bottom right. 

**Hint:** Remember, first we input rows, then columns. Also, don't forget that Python starts counting from *zero*.

In [48]:
# EXERCISE

three_by_five[-1,-1]

0.4780669775207316

Just like with one-dimensional arrays and lists, we can slice arrays that have multiple columns. The same rules apply: first we input the desired rows, then the desired columns. In the cell below, we are asking for our array to output values on row 0, columns 1 through 3 (our array only has columns 0 through 2, but a slice that runs past the end won't raise an error; it just returns the columns that exist).

In [49]:
# EXAMPLE

three_by_five[0, 1:4]

array([0.71496761, 0.66618526])
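
The difference between slicing and indexing past the end of an array can be sketched as follows. The array `grid` is a made-up 5x3 array with predictable values (not the random one above), so the results are easy to check:

```python
import numpy as np

grid = np.arange(15).reshape(5, 3)  # 5 rows, 3 columns, values 0 to 14

# A slice that runs past the last column is simply cut short
clipped = grid[0, 1:4]
print(clipped)  # [1 2]

# A single index past the last column raises an error instead
index_failed = False
try:
    grid[0, 3]
except IndexError as err:
    index_failed = True
    print("No such column:", err)
```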

In [50]:
# EXAMPLE - output rows 1 through 4, column 0

three_by_five[1:, :1]

array([[0.10178713],
       [0.00098478],
       [0.15852084],
       [0.74698854]])

Now how would we output only the last values of all rows?

In [53]:
# EXERCISE

three_by_five[:, -1]

array([0.66618526, 0.46431533, 0.85140256, 0.84401433, 0.47806698])
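
Note that the result above is a flat, one-dimensional array, while the earlier `[1:, :1]` example returned a column with one value per row. As a sketch using a made-up 5x3 array `grid`: a plain index such as `-1` drops that dimension, while a slice such as `-1:` keeps it.

```python
import numpy as np

grid = np.arange(15).reshape(5, 3)  # 5 rows, 3 columns, values 0 to 14

last_flat = grid[:, -1]     # integer index: the column dimension is dropped
last_column = grid[:, -1:]  # slice: the column dimension is kept

print(last_flat)          # [ 2  5  8 11 14]
print(last_flat.shape)    # (5,)
print(last_column.shape)  # (5, 1)
```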

---
Notebook developed by: Kseniya Usovich, Baishakhi Bose, Alisa Bettale, Laurel Hales