<h1 align="center">6. NumPy</h1>

## 6.1 Introduction 

The **NumPy** (Numerical Python) library first appeared in 2006 and is the preferred Python array implementation. It offers a high-performance, richly functional n-dimensional array type called **ndarray**, which from this point forward we’ll refer to by its synonym, array. 

NumPy is one of the many open-source libraries that the Anaconda Python distribution installs. Operations on arrays are up to two orders of magnitude faster than those on lists.

According to libraries.io, over 450 Python libraries depend on NumPy. Many popular data science libraries such as Pandas, SciPy (Scientific Python) and Keras (for deep learning) are built on or depend on NumPy.

## 6.2 Creating arrays from Existing Data

The NumPy documentation recommends importing the **numpy module** as np so that you can access its members with "np.":

In [1]:
import numpy as np

The numpy module provides various functions for creating arrays. Here we use the **array** function, which receives as an argument an array or other collection of elements and returns a new array containing the argument’s elements. 

In [2]:
numbers = np.array([2, 3, 5, 7, 11])

In [3]:
type(numbers)

numpy.ndarray

In [4]:
numbers

array([ 2,  3,  5,  7, 11])

#### Multidimensional Arguments

In [5]:
np.array([[1, 2, 3], [4, 5, 6]])

array([[1, 2, 3],
       [4, 5, 6]])

#### Exercise

Create a one-dimensional array from a list comprehension that produces the even integers from 2 through 20.

In [6]:
np.array([x for x in range(2, 21, 2)])

array([ 2,  4,  6,  8, 10, 12, 14, 16, 18, 20])

Create a 2-by-5 array containing the even integers from 2 through 10 in the first row and the odd integers from 1 through 9 in the second row.

In [7]:
np.array([[x for x in range(2, 11, 2)],[x for x in range(1, 10, 2)]])

array([[ 2,  4,  6,  8, 10],
       [ 1,  3,  5,  7,  9]])

## 6.3 array Attributes

In [8]:
integers = np.array([[1, 2, 3], [4, 5, 6]])
integers

array([[1, 2, 3],
       [4, 5, 6]])

In [9]:
floats = np.array([0.0, 0.1, 0.2, 0.3, 0.4])
floats

array([0. , 0.1, 0.2, 0.3, 0.4])

#### Determining an array’s Element Type

In [10]:
integers.dtype

dtype('int64')

In [11]:
floats.dtype

dtype('float64')

For performance reasons, NumPy is written in the C programming language and uses C’s data types. 

By default, NumPy stores integers as int64 values, which correspond to 64-bit (8-byte) integers in C, and floating-point numbers as float64 values, which correspond to 64-bit (8-byte) floating-point values in C. On platforms where the default C integer is 32 bits, integers may be stored as int32 values instead.
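
As a small sketch of how an element type relates to storage, the following requests specific dtypes explicitly and inspects the bytes used per element. The variable names are illustrative only, and the default integer type is deliberately not asserted because it can vary by platform and NumPy version.

```python
import numpy as np

# Request specific element types instead of relying on the platform default.
small_ints = np.array([1, 2, 3], dtype=np.int32)
big_ints = np.array([1, 2, 3], dtype=np.int64)
reals = np.array([1, 2, 3], dtype=np.float64)

# itemsize reports the number of bytes used per element.
print(small_ints.dtype, small_ints.itemsize)  # int32 4
print(big_ints.dtype, big_ints.itemsize)      # int64 8
print(reals.dtype, reals.itemsize)            # float64 8

# nbytes is the total storage for the elements: size * itemsize.
print(small_ints.nbytes)  # 12
```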

#### Determining an array’s Dimensions

In [12]:
integers.ndim

2

In [13]:
floats.ndim

1

In [14]:
integers.shape

(2, 3)

In [15]:
floats.shape

(5,)

#### Determining an array’s Number of Elements and Element Size

In [16]:
integers.size

6

In [17]:
integers.itemsize  # 4 if C compiler uses 32-bit ints

8

In [18]:
floats.size

5

In [19]:
floats.itemsize

8

#### Iterating Through a Multidimensional array’s Elements

In [20]:
for row in integers:
    for col in row:
        print(col, end=' ')
    print()

1 2 3 
4 5 6 


In [21]:
for i in integers.flat:
    print(i, end=' ')

1 2 3 4 5 6 

## 6.4 Filling arrays with Specific Values

NumPy provides functions **zeros**, **ones** and **full** for creating arrays containing 0s, 1s or a specified value, respectively. By default, zeros and ones create arrays containing float64 values.

In [22]:
np.zeros(5)

array([0., 0., 0., 0., 0.])

In [23]:
np.ones((2, 4), dtype=int)

array([[1, 1, 1, 1],
       [1, 1, 1, 1]])

In [24]:
np.full((3, 5), 13)

array([[13, 13, 13, 13, 13],
       [13, 13, 13, 13, 13],
       [13, 13, 13, 13, 13]])

#### Exercise

Create a NumPy array of size 10, filled with zeros.

In [25]:
np.array([0] * 10)

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

In [26]:
np.zeros(10, dtype=int)

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

Create a 2-by-2 NumPy array of integers, filled with ones.

In [27]:
np.ones((2, 2), dtype=np.int64)

array([[1, 1],
       [1, 1]])

Create a 4-by-4 NumPy array of integers, filled with fives.

In [28]:
np.full((4,4), 5, dtype=np.int64)

array([[5, 5, 5, 5],
       [5, 5, 5, 5],
       [5, 5, 5, 5],
       [5, 5, 5, 5]])

In [29]:
np.ones((4,4)) * 5

array([[5., 5., 5., 5.],
       [5., 5., 5., 5.],
       [5., 5., 5., 5.],
       [5., 5., 5., 5.]])

## 6.5 Creating arrays from Ranges

#### Creating Integer Ranges with arange

In [30]:
np.arange(5)

array([0, 1, 2, 3, 4])

In [31]:
np.arange(5, 10)

array([5, 6, 7, 8, 9])

In [32]:
np.arange(10, 1, -2)

array([10,  8,  6,  4,  2])

#### Creating Floating-Point Ranges with linspace

In [33]:
np.linspace(0.0, 1.0, num=11)

array([0. , 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1. ])

#### Reshaping an array

In [34]:
np.arange(1, 21).reshape(4, 5)

array([[ 1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10],
       [11, 12, 13, 14, 15],
       [16, 17, 18, 19, 20]])

#### Exercise

Use NumPy function arange to create an array of 20 even integers from 2 through 40, then reshape the result into a 4-by-5 array.

In [35]:
np.arange(2, 41, 2).reshape(4, 5)

array([[ 2,  4,  6,  8, 10],
       [12, 14, 16, 18, 20],
       [22, 24, 26, 28, 30],
       [32, 34, 36, 38, 40]])

Create a numpy array with numbers from 1 to 10, in descending order.

In [36]:
np.arange(10,0,-1)

array([10,  9,  8,  7,  6,  5,  4,  3,  2,  1])

## 6.6 List vs. array Performance: Introducing %timeit

#### Timing the Creation of a List Containing Results of 6,000,000 Die Rolls

In [37]:
import random

In [38]:
%timeit rolls_list = [random.randrange(1, 7) for i in range(0, 6_000_000)]

988 ms ± 23.5 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


By default, %timeit executes a statement in a loop, and it runs the loop seven times. If you do not indicate the number of loops, %timeit chooses an appropriate value.
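
%timeit is an IPython magic, so it is not available in a plain Python script. As a rough sketch, the standard library's timeit module can approximate the same "runs" and "loops" structure. The loop count and array size below are arbitrary choices for illustration, and the timings will differ from machine to machine.

```python
import timeit

import numpy as np

loops = 5  # statements executed per measurement
runs = timeit.repeat(
    stmt="np.random.randint(1, 7, 600_000)",
    globals={"np": np},
    repeat=7,       # number of measurements, like %timeit's default of 7 runs
    number=loops,
)

# Each entry is the total time for all loops in one run; divide to get time per loop.
per_loop = [total / loops for total in runs]
print(f"best of {len(per_loop)} runs: {min(per_loop) * 1000:.2f} ms per loop")
```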

#### Timing the Creation of an array Containing Results of 6,000,000 Die Rolls

In [39]:
import numpy as np

In [40]:
%timeit rolls_array = np.random.randint(1, 7, 6_000_000)

46.1 ms ± 715 μs per loop (mean ± std. dev. of 7 runs, 10 loops each)


Operations that on average took more than 500 milliseconds iterated only once, and operations that took fewer than 500 milliseconds iterated 10 times or more.

#### 60,000,000 and 600,000,000 Die Rolls

In [41]:
%timeit rolls_array = np.random.randint(1, 7, 60_000_000)

469 ms ± 11.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [42]:
%timeit rolls_array = np.random.randint(1, 7, 600_000_000)

4.59 s ± 103 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


#### Customizing the %timeit Iterations

In [43]:
%timeit -n1 -r10 rolls_array = np.random.randint(1, 7, 6_000_000)

78.8 ms ± 10.2 ms per loop (mean ± std. dev. of 10 runs, 1 loop each)


#### Exercise

Use %timeit to compare the execution time of the following two statements. The first uses a list comprehension to create a list of the integers from 0 to 9,999,999, then totals them with the built-in sum function. The second statement does the same thing using an array and its sum method.

       sum([x for x in range(10_000_000)])
       
       np.arange(10_000_000).sum()

In [44]:
%timeit sum([x for x in range(10_000_000)])

309 ms ± 1.74 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [45]:
%timeit -n1 np.arange(10_000_000).sum()

18.1 ms ± 826 μs per loop (mean ± std. dev. of 7 runs, 1 loop each)


## 6.7 array Operators

#### Arithmetic Operations with arrays and Individual Numeric Values

In [46]:
import numpy as np

In [47]:
numbers = np.arange(1, 6)

In [48]:
numbers * 2

array([ 2,  4,  6,  8, 10])

In [49]:
numbers ** 2

array([ 1,  4,  9, 16, 25])

In [50]:
numbers  # numbers is unchanged by the arithmetic operators

array([1, 2, 3, 4, 5])

In [51]:
numbers += 10

In [52]:
numbers

array([11, 12, 13, 14, 15])

#### Broadcasting

Normally, the arithmetic operations require as operands two arrays of the same size and shape. When one operand is a single value, called a scalar, NumPy performs the element-wise calculations as if the scalar were an array of the same shape as the other operand, with the scalar value in every element. This is called **broadcasting**.

In [53]:
numbers * [2, 2, 2, 2, 2]

array([22, 24, 26, 28, 30])

In [54]:
numbers * 2

array([22, 24, 26, 28, 30])
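
To make the "as if the scalar were an array" idea concrete, here is a minimal sketch that uses np.broadcast_to to build the expanded operand explicitly. NumPy does not need this step in ordinary code; it is shown only to illustrate the concept, and the name twos is made up for the example.

```python
import numpy as np

numbers = np.arange(11, 16)   # array([11, 12, 13, 14, 15])

# Expand the scalar 2 to the shape of numbers, as broadcasting does conceptually.
twos = np.broadcast_to(2, numbers.shape)
print(twos)                   # [2 2 2 2 2]

# Multiplying by the expanded array matches multiplying by the scalar.
print(np.array_equal(numbers * twos, numbers * 2))  # True
```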

#### Arithmetic Operations Between arrays

You may perform arithmetic operations and augmented assignments between arrays of the same shape.

In [55]:
numbers2 = np.linspace(1.1, 5.5, 5)

In [56]:
numbers2

array([1.1, 2.2, 3.3, 4.4, 5.5])

In [57]:
numbers * numbers2

array([12.1, 26.4, 42.9, 61.6, 82.5])

#### Comparing arrays

You can compare arrays with individual values and with other arrays. Comparisons are performed element-wise. Such comparisons produce arrays of Boolean values in which each element’s True or False value indicates the comparison result.

In [58]:
numbers

array([11, 12, 13, 14, 15])

In [59]:
numbers >= 13

array([False, False,  True,  True,  True])

In [60]:
numbers2

array([1.1, 2.2, 3.3, 4.4, 5.5])

In [61]:
numbers2 < numbers

array([ True,  True,  True,  True,  True])

In [62]:
numbers == numbers2

array([False, False, False, False, False])
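
The Boolean arrays produced by comparisons can be summarized with ordinary array methods. The sketch below is one possible way to do that; the variable name at_least_13 is illustrative.

```python
import numpy as np

numbers = np.array([11, 12, 13, 14, 15])

at_least_13 = numbers >= 13   # element-wise comparison produces a Boolean array
print(at_least_13.any())      # True: at least one element passed
print(at_least_13.all())      # False: not every element passed
print(at_least_13.sum())      # 3: each True counts as 1 when summed
```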

#### Exercise

Create an array of the values from 1 through 5, then use broadcasting to square each value.

In [63]:
np.arange(1, 6) ** 2

array([ 1,  4,  9, 16, 25])

## 6.8 NumPy Calculation Methods

In [64]:
grades = np.array([[87, 96, 70], 
                   [100, 87, 90],
                   [94, 77, 90], 
                   [100, 81, 82]])

In [65]:
grades

array([[ 87,  96,  70],
       [100,  87,  90],
       [ 94,  77,  90],
       [100,  81,  82]])

We can use methods to calculate sum, min, max, mean, std (standard deviation) and var (variance).

In [66]:
grades.sum()

1054

In [67]:
grades.min()

70

In [68]:
grades.max()

100

In [69]:
grades.mean()

87.83333333333333

In [70]:
grades.std()

8.792357792739987

In [71]:
grades.var()

77.30555555555556

#### Calculations by Row or Column

Many calculation methods can be performed on specific array dimensions, known as the array’s axes. These methods receive an **axis** keyword argument that specifies which dimension to use in the calculation, giving you a quick way to perform calculations by row or column in a two-dimensional array.

In [72]:
grades.mean(axis=0)

array([95.25, 85.25, 83.  ])

In [73]:
grades.mean(axis=1)

array([84.33333333, 92.33333333, 87.        , 87.66666667])
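
One way to remember the axis argument is that it names the dimension that gets collapsed. The following sketch checks the column means against a manual calculation; manual_columns is a helper name introduced only for this illustration.

```python
import numpy as np

grades = np.array([[87, 96, 70],
                   [100, 87, 90],
                   [94, 77, 90],
                   [100, 81, 82]])

# axis=0 collapses the rows, leaving one result per column.
column_means = grades.mean(axis=0)
manual_columns = [sum(grades[:, c]) / grades.shape[0]
                  for c in range(grades.shape[1])]

# axis=1 collapses the columns, leaving one result per row.
row_max = grades.max(axis=1)

print(column_means.shape, row_max.shape)          # (3,) (4,)
print(row_max)                                    # [ 96 100  94 100]
print(np.allclose(column_means, manual_columns))  # True
```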

NumPy arrays have many more calculation methods. For the complete list, see
https://docs.scipy.org/doc/numpy/reference/arrays.ndarray.html

#### Exercise

Use NumPy random-number generation to create an array of twelve random grades in the range 60 through 100, then reshape the result into a 3-by-4 array. Calculate the average of all the grades, the averages of the grades in each column and the averages of the grades in each row.

In [74]:
grades = np.random.randint(60, 101, 12).reshape(3, 4)

In [75]:
grades

array([[73, 87, 70, 90],
       [69, 78, 67, 83],
       [68, 81, 60, 86]])

In [76]:
grades.mean()

76.0

In [77]:
grades.mean(axis=0)

array([70.        , 82.        , 65.66666667, 86.33333333])

In [78]:
grades.mean(axis=1)

array([80.  , 74.25, 73.75])

## 6.9 Universal Functions

NumPy offers dozens of standalone **universal functions** (or ufuncs) that perform various element-wise operations. Each performs its task using one or two array or array-like (such as lists) arguments. Some of these functions are called when you use operators like + and * on arrays. Each returns a new array containing the results.


In [79]:
numbers = np.array([1, 4, 9, 16, 25, 36])

In [80]:
np.sqrt(numbers)

array([1., 2., 3., 4., 5., 6.])

In [81]:
numbers2 = np.arange(1, 7) * 10
numbers2

array([10, 20, 30, 40, 50, 60])

In [82]:
np.add(numbers, numbers2)

array([11, 24, 39, 56, 75, 96])

In [83]:
numbers + numbers2

array([11, 24, 39, 56, 75, 96])

#### Broadcasting with Universal Functions

In [84]:
np.multiply(numbers2, 5)

array([ 50, 100, 150, 200, 250, 300])

In [85]:
numbers2 * 5

array([ 50, 100, 150, 200, 250, 300])

In [86]:
numbers3 = numbers2.reshape(2, 3)
numbers3

array([[10, 20, 30],
       [40, 50, 60]])

In [87]:
numbers4 = np.array([2, 4, 6])

In [88]:
np.multiply(numbers3, numbers4)

array([[ 20,  80, 180],
       [ 80, 200, 360]])
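
The multiplication above works because the one-dimensional operand's length matches the number of columns in the two-dimensional operand. The sketch below shows the expanded operand explicitly and what happens when the shapes cannot be aligned; the name expanded is illustrative.

```python
import numpy as np

numbers3 = np.array([[10, 20, 30],
                     [40, 50, 60]])   # shape (2, 3)
numbers4 = np.array([2, 4, 6])        # shape (3,)

# The one-dimensional operand is treated as if it were repeated for each row.
expanded = np.broadcast_to(numbers4, numbers3.shape)
print(expanded)
print(np.array_equal(np.multiply(numbers3, numbers4), numbers3 * expanded))  # True

# Shapes that cannot be aligned raise a ValueError.
try:
    np.multiply(numbers3, np.array([2, 4]))  # length 2 does not match 3 columns
except ValueError as error:
    print("incompatible shapes:", error)
```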

The NumPy documentation lists universal functions in five categories — math, trigonometry, bit manipulation, comparison and floating point.
https://docs.scipy.org/doc/numpy/reference/ufuncs.html

#### Exercise

Create an array of the values from 1 through 5, then use the power universal function and broadcasting to cube each value.

In [89]:
numbers = np.arange(1, 6)
np.power(numbers, 3)

array([  1,   8,  27,  64, 125])

## 6.10 Indexing and Slicing

One-dimensional arrays can be indexed and sliced with the same bracket syntax used for lists and tuples.
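
As a brief illustration of the one-dimensional case, the following sketch indexes and slices a small array. The array values is made up for this example.

```python
import numpy as np

values = np.arange(10, 60, 10)   # array([10, 20, 30, 40, 50])

print(values[0])      # 10  (first element)
print(values[-1])     # 50  (last element)
print(values[1:4])    # [20 30 40]  (indices 1 through 3)
print(values[::2])    # [10 30 50]  (every other element)
print(values[::-1])   # [50 40 30 20 10]  (reversed)
```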

#### Indexing with Two-Dimensional arrays

In [90]:
grades = np.array([[87, 96, 70], [100, 87, 90],
                   [94, 77, 90], [100, 81, 82]])
grades

array([[ 87,  96,  70],
       [100,  87,  90],
       [ 94,  77,  90],
       [100,  81,  82]])

#### Selecting a Subset of a Two-Dimensional array’s Rows

In [91]:
grades[1]

array([100,  87,  90])

In [92]:
grades[0:2]

array([[ 87,  96,  70],
       [100,  87,  90]])

In [93]:
grades[[1, 3]]

array([[100,  87,  90],
       [100,  81,  82]])

#### Selecting a Subset of a Two-Dimensional array’s Columns

In [94]:
grades[:, 0]

array([ 87, 100,  94, 100])

In [95]:
grades[:, 1:3]

array([[96, 70],
       [87, 90],
       [77, 90],
       [81, 82]])

In [96]:
grades[:, [0, 2]]

array([[ 87,  70],
       [100,  90],
       [ 94,  90],
       [100,  82]])

#### Exercise

Given the following array:

    array([[ 1,  2,  3,  4,  5],
           [ 6,  7,  8,  9, 10],
           [11, 12, 13, 14, 15]])
              
* Select the second row.
* Select the first and third rows.
* Select the middle three columns.

In [97]:
a = np.arange(1, 16).reshape(3, 5)

In [98]:
a[1]

array([ 6,  7,  8,  9, 10])

In [99]:
a[[0, 2]]

array([[ 1,  2,  3,  4,  5],
       [11, 12, 13, 14, 15]])

In [100]:
a[:, 1:-1]

array([[ 2,  3,  4],
       [ 7,  8,  9],
       [12, 13, 14]])

## 6.11 Views: Shallow Copies

A **view** is an object that “sees” the data in another object rather than holding its own copy of that data.

Views are also known as **shallow copies**. Various array methods and slicing operations produce views of an array’s data.


The array method **view** returns a _new_ array object with a _view_ of the original array object’s data.

In [101]:
numbers = np.arange(1, 6)
numbers

array([1, 2, 3, 4, 5])

In [102]:
numbers2 = numbers.view()
numbers2

array([1, 2, 3, 4, 5])

We can use the built-in **id** function to see that numbers and numbers2 are _different_ objects:

In [103]:
id(numbers)

140238865731664

In [104]:
id(numbers2)

140238865731184

To prove that numbers2 views the _same_ data as numbers, let’s modify an element in numbers, then display both arrays:

In [105]:
numbers[1] *= 10

In [106]:
numbers2

array([ 1, 20,  3,  4,  5])

In [107]:
numbers

array([ 1, 20,  3,  4,  5])

Similarly, changing a value in the view also changes that value in the original array:

In [108]:
numbers2[1] /= 10

In [109]:
numbers2

array([1, 2, 3, 4, 5])

In [110]:
numbers

array([1, 2, 3, 4, 5])
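
Besides modifying elements, a view can be checked programmatically. This sketch uses the base attribute and np.shares_memory; it is one possible check rather than the only one.

```python
import numpy as np

numbers = np.array([1, 2, 3, 4, 5])
view = numbers.view()

# A view records the array that owns its data in the base attribute.
print(view.base is numbers)             # True
print(np.shares_memory(numbers, view))  # True

# An array that owns its data has no base.
print(numbers.base is None)             # True
```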

#### Slice Views

Slices also create views. Let’s make numbers2 a slice that views only the first three elements of numbers:

In [111]:
numbers2 = numbers[0:3]

In [112]:
numbers2

array([1, 2, 3])

Again, we can confirm that numbers and numbers2 are different objects with id:

In [113]:
id(numbers)

140238865731664

In [114]:
id(numbers2)

140238865731952

We can confirm that numbers2 is a view of _only_ the first _three_ numbers elements by attempting to access numbers2[3], which produces an IndexError:

In [115]:
numbers2[3]

IndexError: index 3 is out of bounds for axis 0 with size 3

Now, let’s modify an element both arrays share, then display them. Again, we see that numbers2 is a view of numbers:

In [None]:
numbers[1] *= 20

In [None]:
numbers

array([ 1, 40,  3,  4,  5])

In [None]:
numbers2

array([ 1, 40,  3])

## 6.12 Deep Copies

Though views are _separate_ array objects, they save memory by sharing element data from other arrays. However, when sharing _mutable_ values, sometimes it’s necessary to create a **deep copy** with _independent_ copies of the original data.

The array method **copy** returns a new array object with a deep copy of the original array object’s data.

In [None]:
numbers = np.arange(1, 6)
numbers

array([1, 2, 3, 4, 5])

In [None]:
numbers2 = numbers.copy()
numbers2

array([1, 2, 3, 4, 5])

To prove that numbers2 has a separate copy of the data in numbers, let’s modify an element in numbers, then display both arrays:

In [None]:
numbers[1] *= 10
numbers

array([ 1, 20,  3,  4,  5])

In [None]:
numbers2

array([1, 2, 3, 4, 5])
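
The same distinction applies to slices. The sketch below contrasts a slice, which is a view, with a copied slice; the names shared and independent are illustrative.

```python
import numpy as np

numbers = np.array([1, 2, 3, 4, 5])

shared = numbers[0:3]              # slice: a view of the first three elements
independent = numbers[0:3].copy()  # deep copy of the same slice

numbers[0] = 99

print(shared)        # [99  2  3]  (sees the change)
print(independent)   # [1 2 3]     (unaffected)
print(np.shares_memory(numbers, independent))  # False
```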

## 6.13 Reshaping and Transposing

NumPy provides various ways to reshape arrays.

#### reshape vs. resize

The array methods reshape and resize both enable you to change an array’s dimensions. Method reshape returns a _view_ (shallow copy) of the original array with the new dimensions; it does not modify the original array.

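In [None]:
grades = np.array([[87, 96, 70], [100, 87, 90]])
grades

array([[ 87,  96,  70],
       [100,  87,  90]])

In [None]:
grades.reshape(1, 6)

array([[ 87,  96,  70, 100,  87,  90]])

In [None]:
grades

array([[ 87,  96,  70],
       [100,  87,  90]])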

Method **resize** modifies the original array’s shape:

In [None]:
grades.resize(1, 6)

In [None]:
grades

array([[ 87,  96,  70, 100,  87,  90]])
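
A few properties of reshape are easy to check in code. This sketch shows that the reshaped result shares data with the original in this case, that -1 lets NumPy infer a dimension, and that a mismatched element count is rejected. The name row is illustrative.

```python
import numpy as np

grades = np.array([[87, 96, 70], [100, 87, 90]])

# reshape leaves the shape of grades alone, but the result shares its data here.
row = grades.reshape(1, 6)
row[0, 0] = 0
print(grades.shape)    # (2, 3)
print(grades[0, 0])    # 0  (the change made through row is visible)

# Passing -1 asks NumPy to infer that dimension from the element count.
print(grades.reshape(3, -1).shape)  # (3, 2)

# The element count must match; otherwise reshape raises a ValueError.
try:
    grades.reshape(4, 2)
except ValueError as error:
    print("cannot reshape:", error)
```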

#### flatten vs. ravel

You can take a multidimensional array and flatten it into a single dimension with the methods **flatten** and **ravel**. Method flatten _deep copies_ the original array’s data:

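In [None]:
grades = np.array([[87, 96, 70], [100, 87, 90]])

In [None]:
grades

array([[ 87,  96,  70],
       [100,  87,  90]])

In [None]:
flattened = grades.flatten()
flattened

array([ 87,  96,  70, 100,  87,  90])

In [None]:
grades

array([[ 87,  96,  70],
       [100,  87,  90]])

In [None]:
flattened[0] = 100
flattened

array([100,  96,  70, 100,  87,  90])

In [None]:
grades

array([[ 87,  96,  70],
       [100,  87,  90]])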

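Method **ravel** produces a _view_ of the original array whenever possible; here the view shares the grades array’s data:

In [None]:
raveled = grades.ravel()
raveled

array([ 87,  96,  70, 100,  87,  90])

In [None]:
raveled[0] = 100
raveled

array([100,  96,  70, 100,  87,  90])

In [None]:
grades

array([[100,  96,  70],
       [100,  87,  90]])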
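
Whether ravel can return a view depends on how the array is laid out in memory. The following sketch uses np.shares_memory to compare flatten and ravel, including a transposed array for which ravel has to copy. The name transposed is illustrative.

```python
import numpy as np

grades = np.array([[87, 96, 70], [100, 87, 90]])

# For this contiguous array, ravel returns a view.
print(np.shares_memory(grades, grades.ravel()))      # True

# flatten always copies.
print(np.shares_memory(grades, grades.flatten()))    # False

# The transposed array is not stored row by row, so ravel must copy its data.
transposed = grades.T
print(np.shares_memory(transposed, transposed.ravel()))  # False
```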

#### Transposing Rows and Columns

You can quickly **transpose** an array’s rows and columns, that is, “flip” the array so the rows become the columns and the columns become the rows.

The **T attribute** returns a transposed _view_ (shallow copy) of the array. 

In [None]:
grades.T

array([[100, 100],
       [ 96,  87],
       [ 70,  90]])

In [None]:
grades

array([[100,  96,  70],
       [100,  87,  90]])

#### Horizontal and Vertical Stacking

You can combine arrays by adding more columns or more rows — known as _horizontal stacking_ and _vertical stacking_. 

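In [None]:
grades2 = np.array([[94, 77, 90], [100, 81, 82]])

In [None]:
np.hstack((grades, grades2))

array([[100,  96,  70,  94,  77,  90],
       [100,  87,  90, 100,  81,  82]])

In [None]:
np.vstack((grades, grades2))

array([[100,  96,  70],
       [100,  87,  90],
       [ 94,  77,  90],
       [100,  81,  82]])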
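
For two-dimensional arrays, horizontal and vertical stacking can also be expressed with np.concatenate and an axis argument. This is a sketch of that equivalence, not a replacement for hstack and vstack; the names side_by_side and stacked are illustrative.

```python
import numpy as np

grades = np.array([[87, 96, 70], [100, 87, 90]])
grades2 = np.array([[94, 77, 90], [100, 81, 82]])

side_by_side = np.concatenate((grades, grades2), axis=1)  # adds columns, like hstack
stacked = np.concatenate((grades, grades2), axis=0)       # adds rows, like vstack

print(side_by_side.shape)  # (2, 6)
print(stacked.shape)       # (4, 3)
print(np.array_equal(side_by_side, np.hstack((grades, grades2))))  # True
print(np.array_equal(stacked, np.vstack((grades, grades2))))       # True
```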

#### Exercise

Given a 2-by-3 array:

    array([[1, 2, 3],
           [4, 5, 6]])

use hstack and vstack to produce the following array:

    array([[1, 2, 3, 1, 2, 3],
           [4, 5, 6, 4, 5, 6],
           [1, 2, 3, 1, 2, 3],
           [4, 5, 6, 4, 5, 6]])

In [None]:
a = np.arange(1, 7).reshape(2, 3)

In [None]:
a = np.hstack((a, a))

In [None]:
a = np.vstack((a, a))
a

array([[1, 2, 3, 1, 2, 3],
       [4, 5, 6, 4, 5, 6],
       [1, 2, 3, 1, 2, 3],
       [4, 5, 6, 4, 5, 6]])