# NumPy Basics: Arrays and Vectorized Computation

NumPy, short for Numerical Python, is one of the most important foundational packages for numerical computing in Python. Most computational packages providing scientific functionality use NumPy's objects as the lingua franca for data exchange.

In [1]:
import numpy as np
my_arr = np.arange(1000000)
my_list = list(range(1000000))

In [2]:
%time for _ in range(10): my_arr2 = my_arr * 2

Wall time: 28.1 ms


In [3]:
%time for _ in range(10): my_list2 = [x * 2 for x in my_list]

Wall time: 1.41 s


NumPy-based algorithms are generally 10 to 100 times faster (or more) than their pure Python counterparts and use significantly less memory.
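The comparison above relies on the IPython `%time` magic. As a rough sketch that also runs outside IPython, the standard-library `timeit` module can take the same measurement, and `nbytes` together with `sys.getsizeof` gives an approximate memory comparison. The smaller size `n` and the variable names `t_arr`, `t_list`, `arr_bytes`, and `list_bytes` are choices made for this illustration; the timings depend on the machine and are not asserted here.

```python
import sys
import timeit

import numpy as np

n = 100_000  # smaller than the one-million-element example, to keep the run short
my_arr = np.arange(n)
my_list = list(range(n))

# Time ten repetitions of each approach
t_arr = timeit.timeit(lambda: my_arr * 2, number=10)
t_list = timeit.timeit(lambda: [x * 2 for x in my_list], number=10)
print(f"ndarray: {t_arr:.4f} s, list: {t_list:.4f} s")

# Both approaches compute the same values
same_values = (my_arr * 2).tolist() == [x * 2 for x in my_list]
print(same_values)

# Approximate memory use: the array stores raw fixed-size integers,
# while the list stores references to separate Python int objects
arr_bytes = my_arr.nbytes
list_bytes = sys.getsizeof(my_list) + sum(sys.getsizeof(x) for x in my_list)
print(arr_bytes, list_bytes)
```

The list figure is only an estimate, since `sys.getsizeof` does not account for objects shared between containers.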

## The NumPy ndarray: A Multidimensional Array Object

One of the key features of NumPy is its N-dimensional array object, or ndarray, which is a fast, flexible container for large datasets in Python. Arrays enable you to perform mathematical operations on whole blocks of data using similar syntax to the equivalent operations between scalar elements.

In [4]:
import numpy as np
# Generate some random data
data = np.random.randn(2,3)
data

array([[-1.79389887,  1.64956558,  0.41962196],
       [-1.13663658, -0.71967212,  0.04769352]])

In [5]:
data * 10

array([[-17.93898871,  16.49565581,   4.19621962],
       [-11.36636584,  -7.19672118,   0.47693518]])

In [6]:
data + data

array([[-3.58779774,  3.29913116,  0.83924392],
       [-2.27327317, -1.43934424,  0.09538704]])

In [7]:
data.shape

(2, 3)

In [8]:
data.dtype

dtype('float64')

### Creating ndarrays

The easiest way to create an array is to use the array function. This accepts any sequence-like object (including other arrays) and produces a new NumPy array containing the passed data. For example, a list is a good candidate for conversion:

In [9]:
data1 = [6, 7.5, 8, 0, 1]

In [10]:
arr1 = np.array(data1)

In [11]:
arr1

array([6. , 7.5, 8. , 0. , 1. ])

In [12]:
data2 = [[1,2,3,4], [5,6,7,8]]

In [13]:
arr2 = np.array(data2)

In [14]:
arr2

array([[1, 2, 3, 4],
       [5, 6, 7, 8]])

In [15]:
arr2.ndim

2

In [16]:
arr2.shape

(2, 4)

In [17]:
arr1.dtype

dtype('float64')

In [18]:
arr2.dtype

dtype('int32')

In addition to np.array, there are a number of other functions for creating new arrays. For example, zeros and ones create arrays of 0s and 1s, respectively, with a given length or shape. empty creates an array without initializing its values to any particular value, so its contents are arbitrary. To create a higher-dimensional array with these functions, pass a tuple for the shape.

In [19]:
np.zeros(10)

array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])

In [20]:
np.zeros((3,6))

array([[0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0.]])

In [21]:
np.empty((2,3,2))

array([[[1.01291322e-311, 2.81617418e-322],
        [0.00000000e+000, 0.00000000e+000],
        [8.01097888e-307, 3.76231868e+174]],

       [[6.56350982e-091, 6.58868203e-066],
        [3.61401698e+174, 9.77380164e+165],
        [6.48224660e+170, 4.93432906e+257]]])
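The cells above exercise zeros and empty but not ones. A minimal sketch of ones, under the assumption that it takes the same length-or-shape argument as zeros, and of filling an empty array before reading it (the name `buf` is only for illustration):

```python
import numpy as np

ones_1d = np.ones(5)        # length given as an integer
ones_2d = np.ones((2, 3))   # shape given as a tuple
print(ones_1d)
print(ones_2d)

# empty only allocates memory; its contents are whatever was there before,
# so fill the array before reading from it
buf = np.empty((2, 3))
buf[:] = 1.0
print(np.array_equal(buf, ones_2d))  # True
```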

arange is an array-valued version of the built-in Python range function:

In [22]:
np.arange(15)

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14])

### Data Types for ndarrays

In [23]:
arr1 = np.array([1,2,3], dtype = np.float64)

In [24]:
arr2 = np.array([1,2,3], dtype = np.int32)

In [25]:
arr1.dtype

dtype('float64')

In [26]:
arr2.dtype

dtype('int32')

In [27]:
arr = np.array([1,2,3,4,5])

In [28]:
arr.dtype

dtype('int32')

In [29]:
float_arr = arr.astype(np.float64) # astype returns a new array; the data is copied

In [30]:
float_arr.dtype

dtype('float64')

In [31]:
arr = np.array([3.7, -1.2, -2.6, 0.5, 12.9, 10.1])

In [32]:
arr

array([ 3.7, -1.2, -2.6,  0.5, 12.9, 10.1])

In [33]:
arr.astype(np.int32)

array([ 3, -1, -2,  0, 12, 10])

In [34]:
numeric_strings = np.array(['1.25', '-9.6','42'], dtype=np.bytes_) # np.string_ was removed in NumPy 2.0; np.bytes_ is the same type

In [35]:
numeric_strings.astype(float)

array([ 1.25, -9.6 , 42.  ])
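Two behaviors of astype in the cells above are easy to miss: casting floats to integers truncates the fractional part, and the original array is left unchanged. The sketch below restates both and adds one case not shown above, a string that cannot be parsed as a number, which raises ValueError. The array `bad` is a made-up input for this illustration.

```python
import numpy as np

arr = np.array([3.7, -1.2, -2.6, 0.5])

# Casting float to integer drops the fractional part (truncates toward zero)
ints = arr.astype(np.int32)
print(ints)  # [ 3 -1 -2  0]

# astype returns a new array; the original keeps its dtype and values
print(arr.dtype, ints.dtype)  # float64 int32

# Strings that are not valid numbers cannot be converted
bad = np.array(['1.25', 'abc'])
try:
    bad.astype(float)
    failed = False
except ValueError as exc:
    failed = True
    print("conversion failed:", exc)
```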

### Arithmetic with NumPy Arrays

Arrays are important because they enable you to express batch operations on data without writing any for loops. NumPy users call this vectorization. Any arithmetic operation between equal-size arrays is applied element-wise:

In [36]:
arr = np.array([[1.,2.,3.],[4.,5.,6.]])

In [37]:
arr

array([[1., 2., 3.],
       [4., 5., 6.]])

In [38]:
arr * arr

array([[ 1.,  4.,  9.],
       [16., 25., 36.]])

In [39]:
arr - arr

array([[0., 0., 0.],
       [0., 0., 0.]])

Arithmetic operations with scalars propagate the scalar argument to each element in the array:

In [40]:
1 / arr

array([[1.        , 0.5       , 0.33333333],
       [0.25      , 0.2       , 0.16666667]])

In [41]:
arr ** 0.5

array([[1.        , 1.41421356, 1.73205081],
       [2.        , 2.23606798, 2.44948974]])

In [42]:
arr2 = np.array([[0.,4.,1.],[7.,2.,12.]])

In [43]:
arr2

array([[ 0.,  4.,  1.],
       [ 7.,  2., 12.]])

In [44]:
arr2 > arr

array([[False,  True, False],
       [ True, False,  True]])
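To make the element-wise rule concrete, the sketch below compares a vectorized expression with explicit Python loops that compute the same result. The loop version is written out only for comparison; the names `vectorized` and `looped` are illustrative.

```python
import numpy as np

arr = np.array([[1., 2., 3.], [4., 5., 6.]])

# Vectorized: one expression, no Python-level loop
vectorized = arr * arr

# Equivalent explicit loops over every (row, column) position
looped = np.empty_like(arr)
for i in range(arr.shape[0]):
    for j in range(arr.shape[1]):
        looped[i, j] = arr[i, j] * arr[i, j]

print(np.array_equal(vectorized, looped))  # True

# A scalar operand is applied to every element
shifted = arr + 10
print(shifted)
```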

### Basic Indexing and Slicing

In [45]:
arr = np.arange(10)

In [46]:
arr

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [47]:
arr[5]

5

In [48]:
arr[5:8]

array([5, 6, 7])

In [49]:
arr[5:8] = 12

In [50]:
arr

array([ 0,  1,  2,  3,  4, 12, 12, 12,  8,  9])

In [51]:
arr_slice = arr[5:8]

In [52]:
arr_slice

array([12, 12, 12])

In [53]:
arr_slice[1] = 12345

In [54]:
arr

array([    0,     1,     2,     3,     4,    12, 12345,    12,     8,
           9])

In [55]:
arr_slice[:] = 64

In [56]:
arr

array([ 0,  1,  2,  3,  4, 64, 64, 64,  8,  9])
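The cells above show that assigning to arr_slice changes arr: a basic slice is a view on the original data, not a copy. A minimal sketch of the difference, using `np.shares_memory` to check and `.copy()` to get independent data (the names `view` and `copy` are illustrative):

```python
import numpy as np

arr = np.arange(10)

view = arr[5:8]          # a view: shares memory with arr
copy = arr[5:8].copy()   # an independent copy

print(np.shares_memory(arr, view))  # True
print(np.shares_memory(arr, copy))  # False

copy[:] = -1             # does not touch arr
view[:] = 64             # writes through to arr
print(arr)               # [ 0  1  2  3  4 64 64 64  8  9]
```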

In [57]:
arr2d = np.array([[1,2,3],[4,5,6],[7,8,9]])

In [58]:
arr2d[2]

array([7, 8, 9])

In [59]:
arr2d[0][2]

3

In [60]:
arr2d[0,2]

3

In [61]:
arr3d = np.array([[[1,2,3],[4,5,6]],[[7,8,9],[10,11,12]]])

In [62]:
arr3d[0]

array([[1, 2, 3],
       [4, 5, 6]])

In [63]:
old_values = arr3d[0].copy()

In [64]:
arr3d[0] = 42

In [65]:
arr3d

array([[[42, 42, 42],
        [42, 42, 42]],

       [[ 7,  8,  9],
        [10, 11, 12]]])

In [66]:
arr3d[0] = old_values

In [67]:
arr3d

array([[[ 1,  2,  3],
        [ 4,  5,  6]],

       [[ 7,  8,  9],
        [10, 11, 12]]])

In [68]:
arr3d[1,0]

array([7, 8, 9])

In [69]:
x = arr3d[1]

In [70]:
x

array([[ 7,  8,  9],
       [10, 11, 12]])

In [71]:
x[0]

array([7, 8, 9])

#### Indexing with slicing

In [72]:
arr

array([ 0,  1,  2,  3,  4, 64, 64, 64,  8,  9])

In [73]:
arr[1:6]

array([ 1,  2,  3,  4, 64])

In [74]:
arr2d

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [75]:
arr2d[:2]

array([[1, 2, 3],
       [4, 5, 6]])

In [76]:
arr2d[:2,1:]

array([[2, 3],
       [5, 6]])

In [77]:
arr2d[1,:2]

array([4, 5])

In [78]:
arr2d[:2,2]

array([3, 6])

In [79]:
arr2d[:,:1]

array([[1],
       [4],
       [7]])

In [80]:
arr2d[:2,1:]=0

In [81]:
arr2d

array([[1, 0, 0],
       [4, 0, 0],
       [7, 8, 9]])
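In the cells above, an integer index removes an axis while a slice keeps it, which is why arr2d[:2,2] is one-dimensional and arr2d[:,:1] is a column with two dimensions. Printing the shapes, as in this sketch, makes the pattern visible:

```python
import numpy as np

arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Two slices: the result keeps both axes
print(arr2d[:2, 1:].shape)  # (2, 2)

# Integer index plus slice: the indexed axis is dropped
print(arr2d[1, :2].shape)   # (2,)
print(arr2d[:, 0].shape)    # (3,)

# A length-one slice keeps the axis, giving a column
print(arr2d[:, :1].shape)   # (3, 1)
```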

### Boolean Indexing

In [82]:
names = np.array(['Bob','Joe','Will','Bob','Will','Joe','Joe'])

In [83]:
data = np.random.randn(7,4)

In [84]:
names

array(['Bob', 'Joe', 'Will', 'Bob', 'Will', 'Joe', 'Joe'], dtype='<U4')

In [85]:
data

array([[ 0.68233203, -1.43742462, -1.13251877, -0.8176084 ],
       [ 0.40161221, -1.13648963, -0.40277263,  0.43077432],
       [ 0.98827128, -0.87935659,  0.55098391,  0.77293233],
       [-0.30821215,  0.18688859,  0.91314749, -0.67276407],
       [-1.44691666, -0.08743517,  0.82512265,  0.3515197 ],
       [-1.57654877,  1.25592093,  0.31702085, -0.07831358],
       [-0.86019978,  0.44233694,  0.40842963,  0.21934595]])

In [86]:
names == 'Bob'

array([ True, False, False,  True, False, False, False])

In [87]:
data[names == "Bob"]

array([[ 0.68233203, -1.43742462, -1.13251877, -0.8176084 ],
       [-0.30821215,  0.18688859,  0.91314749, -0.67276407]])

In [88]:
data[names == 'Bob', 2:]

array([[-1.13251877, -0.8176084 ],
       [ 0.91314749, -0.67276407]])

In [89]:
data[names == 'Bob', 3]

array([-0.8176084 , -0.67276407])

In [90]:
names != 'Bob'

array([False,  True,  True, False,  True,  True,  True])

In [91]:
data[~(names == 'Bob')]

array([[ 0.40161221, -1.13648963, -0.40277263,  0.43077432],
       [ 0.98827128, -0.87935659,  0.55098391,  0.77293233],
       [-1.44691666, -0.08743517,  0.82512265,  0.3515197 ],
       [-1.57654877,  1.25592093,  0.31702085, -0.07831358],
       [-0.86019978,  0.44233694,  0.40842963,  0.21934595]])

In [92]:
cond = names == 'Bob'

In [93]:
data[~cond]

array([[ 0.40161221, -1.13648963, -0.40277263,  0.43077432],
       [ 0.98827128, -0.87935659,  0.55098391,  0.77293233],
       [-1.44691666, -0.08743517,  0.82512265,  0.3515197 ],
       [-1.57654877,  1.25592093,  0.31702085, -0.07831358],
       [-0.86019978,  0.44233694,  0.40842963,  0.21934595]])

In [94]:
mask = (names == "Bob") | (names == 'Will')

In [95]:
mask

array([ True, False,  True,  True,  True, False, False])

In [96]:
data[mask]

array([[ 0.68233203, -1.43742462, -1.13251877, -0.8176084 ],
       [ 0.98827128, -0.87935659,  0.55098391,  0.77293233],
       [-0.30821215,  0.18688859,  0.91314749, -0.67276407],
       [-1.44691666, -0.08743517,  0.82512265,  0.3515197 ]])

In [97]:
data[data < 0] = 0

In [98]:
data

array([[0.68233203, 0.        , 0.        , 0.        ],
       [0.40161221, 0.        , 0.        , 0.43077432],
       [0.98827128, 0.        , 0.55098391, 0.77293233],
       [0.        , 0.18688859, 0.91314749, 0.        ],
       [0.        , 0.        , 0.82512265, 0.3515197 ],
       [0.        , 1.25592093, 0.31702085, 0.        ],
       [0.        , 0.44233694, 0.40842963, 0.21934595]])

In [99]:
data[names != 'Joe'] = 7

In [100]:
data

array([[7.        , 7.        , 7.        , 7.        ],
       [0.40161221, 0.        , 0.        , 0.43077432],
       [7.        , 7.        , 7.        , 7.        ],
       [7.        , 7.        , 7.        , 7.        ],
       [7.        , 7.        , 7.        , 7.        ],
       [0.        , 1.25592093, 0.31702085, 0.        ],
       [0.        , 0.44233694, 0.40842963, 0.21934595]])
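Three points about the boolean-indexing cells above are worth restating in a sketch. Selecting with a boolean array returns a copy, assigning through the mask modifies the original, and the Python keywords `and` and `or` cannot replace `&` and `|` for arrays. The integer array built with `np.arange` is a stand-in for the random data so that the output is reproducible.

```python
import numpy as np

names = np.array(['Bob', 'Joe', 'Will', 'Bob', 'Will', 'Joe', 'Joe'])
data = np.arange(28).reshape(7, 4)  # stand-in for the random data above

mask = (names == 'Bob') | (names == 'Will')

# Boolean selection returns a copy, so editing it leaves data alone
selected = data[mask]
selected[:] = -1
print(data[0])  # [0 1 2 3]

# Python's `and`/`or` do not work element-wise on arrays
try:
    (names == 'Bob') or (names == 'Will')
    raised = False
except ValueError as exc:
    raised = True
    print("ValueError:", exc)

# Assigning through the mask does modify data in place
data[mask] = 0
print(data[mask].sum())  # 0
```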

### Fancy Indexing

### Transposing Arrays and Swapping Axes

## Universal Functions: Fast Element-Wise Array Functions

## Array-Oriented Programming with Arrays

### Expressing Conditional Logic as Array Operations

### Mathematical and Statistical Methods

### Methods for Boolean Arrays

### Sorting

### Unique and Other Set Logic

## File Input and Output with Arrays

## Linear Algebra

## Pseudorandom Number Generator

## Example: Random Walk

### Simulating Many Random Walks at Once