# Chap 4. NumPy Basics: Arrays and Vectorized Computation

- [4.1 The NumPy ndarray: A Multidimensional Array Object](#41)
    - [4.1.1 Creating ndarrays](#411)
    - [4.1.2 Data Types for ndarrays](#412)
    - [4.1.3 Arithmetic with NumPy Arrays](#413)
    - [4.1.4 Basic Indexing and Slicing](#414)
    - [4.1.5 Boolean Indexing](#415)
    - [4.1.6 Fancy Indexing](#416)

Because NumPy provides an easy-to-use C API, it is straightforward to pass data to external libraries written in a low-level language and also for external libraries to return data to Python as NumPy arrays. This feature has made Python a language of choice for wrapping legacy C/C++/Fortran codebases and giving them a dynamic and easy-to-use interface.

In [3]:
import numpy as np

In [4]:
my_arr = np.arange(1000000)
my_list = list(range(1000000))

In [5]:
%time for _ in range(10): my_arr2 = my_arr * 2

CPU times: user 25.1 ms, sys: 23.6 ms, total: 48.7 ms
Wall time: 51.6 ms


In [6]:
%time for _ in range(10): my_list2 = [x * 2 for x in my_list]

CPU times: user 857 ms, sys: 224 ms, total: 1.08 s
Wall time: 1.09 s


> _NumPy-based algorithms are generally 10 to 100 times faster (or more) than their pure Python counterparts and use significantly less memory._
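
The memory half of that claim can be checked directly. The sketch below is an illustration and was not part of the timing session above. It compares the array's data buffer (`nbytes`) with a rough estimate for the list: the list object's own pointer table plus one Python `int` object per element, measured with `sys.getsizeof`. Exact figures depend on the platform and Python version.

```python
import sys
import numpy as np

n = 1_000_000
my_arr = np.arange(n)
my_list = list(range(n))

# Bytes used by the array's contiguous data buffer
arr_bytes = my_arr.nbytes

# Rough estimate for the list: the list object itself (its pointer table)
# plus every int object it references
list_bytes = sys.getsizeof(my_list) + sum(sys.getsizeof(x) for x in my_list)

print(f"ndarray: {arr_bytes / 1e6:.1f} MB")
print(f"list:    {list_bytes / 1e6:.1f} MB")
```

The array stores raw fixed-width numbers side by side, while the list stores pointers to separately allocated Python objects, which is where most of the difference comes from.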

## 4.1 The NumPy ndarray: A Multidimensional Array Object 
<a id="41"></a>

In [7]:
# Generate some random data
data = np.random.randn(2, 3)
data

array([[ 0.58122518, -1.42456338, -0.31834905],
       [-0.47119944, -1.06181187,  0.66714829]])

In [8]:
data * 10

array([[  5.81225184, -14.24563381,  -3.18349052],
       [ -4.71199443, -10.61811866,   6.67148291]])

In [9]:
data + data

array([[ 1.16245037, -2.84912676, -0.6366981 ],
       [-0.94239889, -2.12362373,  1.33429658]])

In [11]:
data.shape

(2, 3)

In [12]:
data.dtype

dtype('float64')

### 4.1.1 Creating ndarrays
<a id="411"></a>

In [15]:
data1 = [6, 7.5, 8, 0, 1]
arr1 = np.array(data1)
arr1

array([ 6. ,  7.5,  8. ,  0. ,  1. ])

In [16]:
# Nested sequences, like a list of equal-length lists, are converted into a multidimensional array
data2 = [[1, 2, 3, 4], [5, 6, 7, 8]]
arr2 = np.array(data2)
arr2

array([[1, 2, 3, 4],
       [5, 6, 7, 8]])

In [20]:
# one-dimensional
np.zeros(10)

array([ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.])

In [21]:
# two-dimensional (the shape is passed as a tuple)
np.zeros((3, 6))

array([[ 0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.]])

In [31]:
# three-dimensional; np.empty does not initialize memory, so the values shown are arbitrary
np.empty((2, 3, 2), dtype='f2')

array([[[  0.00000000e+00,   0.00000000e+00],
        [  0.00000000e+00,  -0.00000000e+00],
        [  3.53515625e+00,   4.47360000e+04]],

       [[  1.21593475e-04,  -5.12000000e+02],
        [  1.19209290e-07,   0.00000000e+00],
        [  0.00000000e+00,  -0.00000000e+00]]], dtype=float16)

In [25]:
# arange is an array-valued version of the built-in Python range function
np.arange(15)

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14])
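
A few other creation functions follow the same pattern as `zeros`, `empty`, and `arange`. The snippet below is a small sketch using standard NumPy calls; the variable names are only illustrative.

```python
import numpy as np

ones = np.ones((2, 3))          # 2 x 3 array filled with 1.0
like = np.zeros_like(ones)      # zeros with the same shape and dtype as `ones`
sevens = np.full((2, 2), 7)     # 2 x 2 array filled with the value 7
ident = np.eye(3)               # 3 x 3 identity matrix
evens = np.arange(0, 10, 2)     # start, stop (exclusive), step, as with range()

print(evens)
```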

### 4.1.2 Data Types for ndarrays
<a id="412"></a>
dtypes are a source of NumPy’s flexibility for interacting with data coming from other systems: a dtype tells NumPy how to interpret each fixed-size chunk of memory in the array.

In [26]:
arr1 = np.array([1, 2, 3], dtype=np.float64)
arr2 = np.array([1, 2, 3], dtype=np.int32)

Type|Type code|Description
-|-|-
int8, uint8|i1, u1|Signed and unsigned 8-bit (1 byte) integer types
int16, uint16|i2, u2|Signed and unsigned 16-bit integer types
int32, uint32|i4, u4|Signed and unsigned 32-bit integer types
int64, uint64|i8, u8|Signed and unsigned 64-bit integer types
float16|f2|Half-precision floating point
float32|f4 or f|Standard single-precision floating point; compatible with C float
float64|f8 or d|Standard double-precision floating point; compatible with C double and Python float object
float128|f16 or g|Extended-precision floating point
complex64, complex128, complex256|c8, c16, c32|Complex numbers represented by two 32-, 64-, or 128-bit floats, respectively
bool|?|Boolean type storing True and False values
object|O|Python object type; a value can be any Python object
string_|S|Fixed-length ASCII string type (1 byte per character); for example, to create a string dtype with length 10, use 'S10'
unicode_|U|Fixed-length Unicode type (number of bytes platform specific); same specification semantics as string_ (e.g., 'U10')
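
To illustrate how type codes help with data from other systems, the sketch below reads raw bytes into an array. The payload is hypothetical: `struct.pack` stands in for whatever a C program or a binary file would produce.

```python
import struct
import numpy as np

# Hypothetical payload: three little-endian 32-bit signed integers
raw = struct.pack("<3i", 1, 2, 3)

# The type code '<i4' tells NumPy how to interpret the raw bytes:
# little-endian ('<'), signed integer ('i'), 4 bytes wide
values = np.frombuffer(raw, dtype="<i4")

print(values)
print(values.itemsize)  # bytes per element
```

The same 12 bytes would decode to different numbers under a different dtype, which is why the dtype has to match the producer's layout.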

In [28]:
# You can explicitly convert or cast an array from one dtype to another using ndarray’s astype method
arr = np.array([1, 2, 3, 4, 5])
arr.dtype

dtype('int64')

In [29]:
float_arr = arr.astype(np.float64)
float_arr.dtype

dtype('float64')
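
Casting in the other direction loses information. The following sketch, with made-up sample values, shows that converting floats to integers truncates the fractional part and that strings holding numbers can be parsed with `astype` as well.

```python
import numpy as np

# Float to integer: the fractional part is truncated toward zero
arr = np.array([3.7, -1.2, -2.6, 0.5, 12.9, 10.1])
int_arr = arr.astype(np.int32)
print(int_arr)  # values: 3, -1, -2, 0, 12, 10

# Strings that represent numbers can be converted to a numeric dtype
numeric_strings = np.array(["1.25", "-9.6", "42"])
floats = numeric_strings.astype(np.float64)
print(floats)

# astype returns a new array; the original keeps its dtype
print(arr.dtype, int_arr.dtype)
```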

### 4.1.3 Arithmetic with NumPy Arrays
<a id='413'></a>
Arrays are important because they enable you to express batch operations on data without writing any for loops. NumPy users call this ***vectorization***. Any arithmetic operation between equal-size arrays is applied element-wise.

In [32]:
arr = np.array([[1., 2., 3.], [4., 5., 6.]])
arr

array([[ 1.,  2.,  3.],
       [ 4.,  5.,  6.]])

In [33]:
arr * arr

array([[  1.,   4.,   9.],
       [ 16.,  25.,  36.]])

In [34]:
arr - arr

array([[ 0.,  0.,  0.],
       [ 0.,  0.,  0.]])

In [35]:
# Comparisons between arrays of the same size yield boolean arrays
arr2 = np.array([[4., 1., 4.], [2., 15., 3.]])
arr > arr2

array([[False,  True, False],
       [ True, False,  True]], dtype=bool)
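
To make the contrast with an explicit loop concrete, here is a minimal sketch that computes the same reciprocal both ways. Arithmetic with a scalar applies the scalar to every element.

```python
import numpy as np

arr = np.array([[1., 2., 3.], [4., 5., 6.]])

# Loop version: visit every element by hand
looped = np.empty_like(arr)
for i in range(arr.shape[0]):
    for j in range(arr.shape[1]):
        looped[i, j] = 1 / arr[i, j]

# Vectorized version: one expression, no Python-level loop
vectorized = 1 / arr

# Other scalar operations work the same way
roots = arr ** 0.5

print(np.allclose(looped, vectorized))
```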

### 4.1.4 Basic Indexing and Slicing
<a id='414'></a>

||axis=1|axis=1|axis=1
-|-|-|-|-
||0|1|2
axis=0|0|0,0|0,1|0,2
axis=0|1|1,0|1,1|1,2
axis=0|2|2,0|2,1|2,2

In [37]:
arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
arr2d[2]

array([7, 8, 9])

In [39]:
arr2d[0][2]

3

In [41]:
arr2d[0, 2]

3
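
Slices can be given per axis in the same comma-separated form. The sketch below reuses `arr2d` from above; the result names are only illustrative.

```python
import numpy as np

arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

top_right = arr2d[:2, 1:]    # rows 0-1 along axis 0, columns 1-2 along axis 1
second_row = arr2d[1, :2]    # an integer index on axis 0 drops that dimension
first_col = arr2d[:, :1]     # a lone colon takes the whole axis

print(top_right)
print(second_row.shape, first_col.shape)
```

Mixing an integer index with a slice gives a lower-dimensional result, while slicing both axes keeps the array two-dimensional.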

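> Plain assignment does not copy an array; it only creates a second name for the same data. If you need an independent copy of a 2D or 3D array, request it explicitly with ***np.copy()*** or the array's ***copy()*** method.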

In [45]:
# A 2 × 2 × 3 array
arr3d = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])

In [43]:
old_values = arr3d
arr3d[0] = 42
arr3d

array([[[42, 42, 42],
        [42, 42, 42]],

       [[ 7,  8,  9],
        [10, 11, 12]]])

In [44]:
old_values

array([[[42, 42, 42],
        [42, 42, 42]],

       [[ 7,  8,  9],
        [10, 11, 12]]])

In [46]:
arr3d = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])
old_values = arr3d.copy()
arr3d[0] = 42
arr3d

array([[[42, 42, 42],
        [42, 42, 42]],

       [[ 7,  8,  9],
        [10, 11, 12]]])

In [47]:
old_values

array([[[ 1,  2,  3],
        [ 4,  5,  6]],

       [[ 7,  8,  9],
        [10, 11, 12]]])
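
The same rule applies to slices, which are views onto the original data and not copies. The sketch below uses `np.shares_memory` to make the difference visible; the variable names are illustrative.

```python
import numpy as np

arr = np.arange(10)

view = arr[5:8]      # a slice is a view onto the same buffer
view[:] = 12         # writes through to arr

independent = arr[5:8].copy()   # an explicit copy owns its own data
independent[:] = -1             # arr is not affected

print(arr)
print(np.shares_memory(arr, view))         # True
print(np.shares_memory(arr, independent))  # False
```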

### 4.1.5 Boolean Indexing
<a id='415'></a>

In [55]:
names = np.array(['Bob', 'Joe', 'Will', 'Bob', 'Will', 'Joe', 'Joe'])
data = np.random.randn(7, 4)
data[names == 'Bob']

array([[ 0.58716749,  0.58244006, -2.25489352,  0.94084445],
       [ 1.42738636,  0.57198202,  1.1614776 , -0.70485413]])

In [65]:
data[names == 'Bob', 1]

array([ 0.58244006,  0.57198202])

In [62]:
data[~(names == 'Bob')]

array([[-0.81169397,  0.75090702, -0.7046266 ,  1.78082909],
       [ 0.36238525, -0.1729714 ,  1.0748155 ,  0.36789083],
       [-1.00702938, -0.78230828, -0.83362318,  0.27490664],
       [ 0.94868435,  0.33253875,  0.56029139,  1.6244712 ],
       [ 0.0500525 ,  0.94193409,  2.16835048, -0.68900437]])

In [66]:
data[~(names == 'Bob')] = 0
data

array([[ 0.58716749,  0.58244006, -2.25489352,  0.94084445],
       [ 0.        ,  0.        ,  0.        ,  0.        ],
       [ 0.        ,  0.        ,  0.        ,  0.        ],
       [ 1.42738636,  0.57198202,  1.1614776 , -0.70485413],
       [ 0.        ,  0.        ,  0.        ,  0.        ],
       [ 0.        ,  0.        ,  0.        ,  0.        ],
       [ 0.        ,  0.        ,  0.        ,  0.        ]])
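
Boolean conditions can also be combined, and a mask can be built from the data itself. The following is a small sketch; the seeded generator is an assumption added only so the example is reproducible.

```python
import numpy as np

names = np.array(['Bob', 'Joe', 'Will', 'Bob', 'Will', 'Joe', 'Joe'])
rng = np.random.default_rng(0)          # seeded for reproducibility (assumption)
data = rng.standard_normal((7, 4))

# Combine conditions with & (and) and | (or); the Python keywords
# `and` / `or` do not work element-wise on boolean arrays
mask = (names == 'Bob') | (names == 'Will')
selected = data[mask]                   # rows 0, 2, 3, 4

# A boolean array derived from the data can drive assignment
data[data < 0] = 0                      # set every negative value to zero

print(mask)
print(selected.shape)
```

Selecting with a boolean array always produces a copy, so `selected` is unaffected by the later assignment to `data`.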

### 4.1.6 Fancy Indexing
<a id="416"></a>