## Chapter 4

# NumPy: Arrays and Vectorized Computation

NumPy internally stores data in a contiguous block of memory, independent of
other built-in Python objects.
NumPy’s library of algorithms written in the C language can operate on this memory without any type checking or other overhead.
NumPy arrays also use much less memory than built-in Python sequences.
NumPy operations perform complex computations on entire arrays without the
need for Python for loops.
To give you an idea of the performance difference, consider a NumPy array of one
million integers, and the equivalent Python list:

In [2]:
import numpy as np

In [3]:
my_arr = np.arange(1000000)

In [4]:
my_list = list(range(1000000))

In [7]:
%time for _ in range(10): my_arr2 = my_arr * 2

CPU times: user 41.4 ms, sys: 19 ms, total: 60.5 ms
Wall time: 70 ms


In [29]:
%time for _ in range(10): my_list2 = [x * 2 for x in my_list]

CPU times: user 520 ms, sys: 103 ms, total: 623 ms
Wall time: 623 ms


NumPy-based algorithms are generally 10 to 100 times faster (or more) than their
pure Python counterparts and use significantly less memory.
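The memory claim can be checked roughly with a sketch like the one below. It assumes CPython and requests a 64-bit integer dtype explicitly so the element size does not depend on the platform. The names `arr`, `lst`, `arr_bytes`, and `list_bytes` are illustrative. Note that `sys.getsizeof` on a list counts only its array of pointers, so the integer objects are added separately.

```python
import sys

import numpy as np

n = 1_000_000

# Request int64 explicitly so each element takes 8 bytes on every platform.
arr = np.arange(n, dtype=np.int64)
lst = list(range(n))

# nbytes counts only the array's data buffer (n elements * 8 bytes each).
arr_bytes = arr.nbytes

# A list stores pointers to separate int objects, so count both parts.
list_bytes = sys.getsizeof(lst) + sum(sys.getsizeof(x) for x in lst)

print(f"ndarray data buffer: {arr_bytes / 1e6:.1f} MB")
print(f"list plus its ints:  {list_bytes / 1e6:.1f} MB")
```

The exact list figure varies with the Python version, so treat it as an estimate rather than a benchmark.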

# 4.1 The NumPy ndarray: A Multidimensional Array Object

One of the key features of NumPy is its N-dimensional array object, or ndarray,
which is a fast, flexible container for large datasets in Python. Arrays enable you to
perform mathematical operations on whole blocks of data using similar syntax to the
equivalent operations between scalar elements.

In [32]:
data = np.random.randn(2,3)

In [34]:
data

array([[ 0.28081012, -2.22844241, -1.06032421],
       [-0.86947304, -1.01906417, -0.85600008]])

In [36]:
data * 10

array([[  2.80810121, -22.28442406, -10.60324208],
       [ -8.69473044, -10.19064168,  -8.5600008 ]])

In [37]:
data + data

array([[ 0.56162024, -4.45688481, -2.12064842],
       [-1.73894609, -2.03812834, -1.71200016]])
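To make the "same syntax as scalars" point concrete, here is a minimal sketch comparing one array expression with the explicit nested loop it replaces. The array `sample` and its values are made up for illustration, so the result is reproducible.

```python
import numpy as np

# A small fixed array (illustrative values, not from the examples above).
sample = np.array([[0.5, -1.0, 2.0],
                   [1.5, 0.0, -3.0]])

# One expression applies the arithmetic to every element at once.
vectorized = sample * 10 + sample

# The equivalent pure-Python approach visits each element in turn.
looped = np.empty_like(sample)
for i in range(sample.shape[0]):
    for j in range(sample.shape[1]):
        looped[i, j] = sample[i, j] * 10 + sample[i, j]

# Both approaches should produce the same values.
print(np.allclose(vectorized, looped))
```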

An ndarray is a generic multidimensional container for homogeneous data; that is, all
of the elements must be the same type. 

Every array has a shape, a tuple indicating the
size of each dimension, and a dtype, an object describing the data type of the array:

In [41]:
data.shape

(2, 3)

In [42]:
data.dtype

dtype('float64')
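The attributes above are related to a few others that NumPy also exposes: `ndim`, `size`, `itemsize`, and `nbytes`. The following sketch shows how they fit together for a 2 x 3 array of floats; the variable name `sample` is illustrative.

```python
import numpy as np

sample = np.random.randn(2, 3)

print(sample.shape)     # tuple with one entry per dimension
print(sample.ndim)      # number of dimensions, equal to len(sample.shape)
print(sample.size)      # total element count, the product of the shape entries
print(sample.dtype)     # a single dtype shared by every element
print(sample.itemsize)  # bytes used by one element
print(sample.nbytes)    # size * itemsize, the size of the data buffer
```

Because every element has the same dtype, the buffer size follows directly from the element count and the element size.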

a. Creating ndarrays

The simplest way to create an array is to pass a sequence, such as a list, to np.array:

In [47]:
data1 = [1,2,3.45,6,7,22]

In [48]:
arr1 = np.array(data1)

In [49]:
arr1

array([ 1.  ,  2.  ,  3.45,  6.  ,  7.  , 22.  ])

Nested sequences, like a list of equal-length lists, will be converted into a multidimensional array:

In [52]:
data2 = [[1,2,3,4],[5,6,7,8]]

In [53]:
arr2 = np.array(data2)

In [54]:
arr2

array([[1, 2, 3, 4],
       [5, 6, 7, 8]])

Since data2 was a list of lists, the NumPy array arr2 has two dimensions with shape
inferred from the data. We can confirm this by inspecting the ndim and shape
attributes:

In [56]:
arr2.ndim

2

In [57]:
arr2.shape

(2, 4)
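The same inference extends to deeper nesting. As a small sketch, a list of lists of lists becomes a three-dimensional array, provided every level has equal-length entries; `data3` and `arr3` are hypothetical names continuing the pattern above.

```python
import numpy as np

# Two blocks, each holding two rows of two values.
data3 = [[[1, 2], [3, 4]],
         [[5, 6], [7, 8]]]

arr3 = np.array(data3)

print(arr3.ndim)   # one dimension per level of nesting
print(arr3.shape)  # length of the list at each nesting level, outermost first
```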

Unless explicitly specified, np.array tries to infer a good data
type for the array that it creates. The data type is stored in a special dtype metadata
object; for example, in the previous two examples we have:


In [58]:
arr1.dtype

dtype('float64')

In [59]:
arr2.dtype

dtype('int64')
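When the inferred type is not the one you want, the dtype can be given explicitly. The sketch below uses the `dtype` argument of `np.array` and the `astype` method; the names `inferred`, `as_float`, and `as_int32` are illustrative. The default integer type can differ between platforms and NumPy versions, so the inferred dtype may not always print as int64.

```python
import numpy as np

data2 = [[1, 2, 3, 4], [5, 6, 7, 8]]

# Let NumPy infer the type: integers in, an integer dtype out.
inferred = np.array(data2)

# Override the inference when the array is created.
as_float = np.array(data2, dtype=np.float64)

# Or convert an existing array; astype returns a new array.
as_int32 = inferred.astype(np.int32)

print(inferred.dtype, as_float.dtype, as_int32.dtype)
```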

In addition to np.array, there are a number of other functions for creating new arrays.

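For example, np.zeros creates an array of zeros with a given length or shape. To create a higher-dimensional array, pass a tuple for the shape. The tuple is easiest to read from right to left: the last entry is the length of the innermost axis (the columns), the one before it is the number of rows, and so on outward.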

In [67]:
np.zeros(10)

array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])

In [68]:
np.zeros((3,6))

array([[0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0.]])

In [70]:
np.zeros((4,3,2))

array([[[0., 0.],
        [0., 0.],
        [0., 0.]],

       [[0., 0.],
        [0., 0.],
        [0., 0.]],

       [[0., 0.],
        [0., 0.],
        [0., 0.]],

       [[0., 0.],
        [0., 0.],
        [0., 0.]]])
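One way to check how a shape tuple is read is to index into the array and watch the leading entries drop away. The sketch below does this for the (4, 3, 2) case and then shows a few other creation functions that NumPy provides (`np.ones`, `np.full`, `np.arange`, `np.zeros_like`). The variable names are illustrative.

```python
import numpy as np

cube = np.zeros((4, 3, 2))

# Indexing the first axis removes the leftmost shape entry.
print(cube[0].shape)     # one of the 4 blocks: 3 rows by 2 columns
print(cube[0, 0].shape)  # one row inside a block: 2 values

# A few related creation functions.
ones = np.ones((2, 3))            # every element is 1.0
sevens = np.full((2, 3), 7)       # every element is a chosen value
counts = np.arange(5)             # array counterpart of range(5)
like_cube = np.zeros_like(cube)   # zeros with the shape and dtype of cube

print(ones.shape, sevens.shape, counts.shape, like_cube.shape)
```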