# NumPy Basics: Arrays and Vectorized Computation

One reason NumPy is so important for numerical computing in Python is that it is designed for efficiency on large arrays of data. To give you an idea of the performance difference, consider a NumPy array of one million integers and the equivalent Python list:

In [6]:
import numpy as np
my_arr = np.arange(1000000)
my_list = list(range(1000000))

# Now let’s multiply each sequence by 2:
%time for _ in range(10): my_arr = my_arr * 2

CPU times: user 16.3 ms, sys: 12 µs, total: 16.4 ms
Wall time: 16.4 ms


In [5]:
%time for _ in range(10): my_list = [x * 2 for x in my_list]

CPU times: user 516 ms, sys: 109 ms, total: 625 ms
Wall time: 640 ms
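
The `%time` magic used above only exists inside IPython and Jupyter. As a rough sketch of the same comparison in a plain Python script, the standard-library `timeit` module can be used instead. The helper names `double_array` and `double_list` are made up for this illustration, and the timings will differ from machine to machine.

```python
import timeit

import numpy as np

my_arr = np.arange(1_000_000)
my_list = list(range(1_000_000))


def double_array(arr):
    # Vectorized: a single multiplication over the whole array
    return arr * 2


def double_list(values):
    # Pure Python: one multiplication per element in a list comprehension
    return [x * 2 for x in values]


# Run each version 10 times, mirroring the loops timed above
arr_time = timeit.timeit(lambda: double_array(my_arr), number=10)
list_time = timeit.timeit(lambda: double_list(my_list), number=10)

print(f"ndarray: {arr_time:.4f} s, list: {list_time:.4f} s")
```

Unlike the cells above, this version does not reassign `my_arr` or `my_list`, so the inputs stay the same size and magnitude on every run.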


## The NumPy ndarray: A Multidimensional Array Object

In [9]:
# Generate some random data
data = np.random.randn(2,3)
data

array([[ 2.4745812 , -1.2826441 , -1.74991236],
       [-1.14500153, -0.01480856, -0.45657022]])

In [14]:
data * 10

array([[ 24.74581199, -12.82644102, -17.49912359],
       [-11.45001532,  -0.14808563,  -4.56570223]])

In [16]:
data + data

array([[ 4.9491624 , -2.5652882 , -3.49982472],
       [-2.29000306, -0.02961713, -0.91314045]])

In [18]:
data.shape

(2, 3)

In [20]:
data.dtype

dtype('float64')

## Creating ndarrays

In [23]:
data1 = [6, 7.5, 8, 0, 1]
arr1 = np.array(data1)
arr1

array([6. , 7.5, 8. , 0. , 1. ])

In [27]:
data2 = [[1, 2, 3, 4], [5, 6, 7, 8]]
arr2 = np.array(data2)
arr2

array([[1, 2, 3, 4],
       [5, 6, 7, 8]])

Since `data2` was a list of lists, the NumPy array `arr2` has two dimensions, with its shape
inferred from the data. We can confirm this by inspecting the `ndim` and `shape`
attributes:

In [28]:
arr2.ndim

2

In [29]:
arr2.shape

(2, 4)
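
To make the link between nesting depth and dimensions concrete, here is a small sketch with three inputs of increasing depth. The variable names `flat`, `nested`, and `deep` are chosen only for this example.

```python
import numpy as np

flat = np.array([1, 2, 3])                        # list of numbers
nested = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])   # list of lists
deep = np.array([[[1], [2]], [[3], [4]]])         # list of lists of lists

# Each extra level of nesting adds one dimension
for name, arr in [("flat", flat), ("nested", nested), ("deep", deep)]:
    print(name, arr.ndim, arr.shape)
```

This prints `flat 1 (3,)`, `nested 2 (2, 4)`, and `deep 3 (2, 2, 1)`: `ndim` always equals the length of `shape`.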

In [32]:
arr1.dtype

dtype('float64')

In [33]:
arr2.dtype

dtype('int64')

In [36]:
np.zeros(10)

array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])

In [39]:
np.zeros((3,6))

array([[0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0.]])

In [48]:
np.empty((2,3,3))

array([[[0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]],

       [[0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]]])
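
The zeros printed above are incidental: `np.empty` allocates memory without initializing it, so the contents may be arbitrary values on another run or machine. A minimal sketch of safe usage, assuming the buffer is meant to be overwritten completely before it is read:

```python
import numpy as np

# Allocate without initializing; the contents are arbitrary at this point
buf = np.empty((2, 3, 3))

# Overwrite every element before reading anything back
buf.fill(0.0)

# When zeros are actually required, np.zeros states the intent directly
zeros = np.zeros((2, 3, 3))

print(np.array_equal(buf, zeros))
```

After the `fill` call the two arrays are equal, so the last line prints `True`.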

In [49]:
np.arange(15)

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14])

## Data Types for ndarrays

In [63]:
arr1 = np.array([1,2,3], dtype=np.float64)
arr1.dtype

dtype('float64')

In [64]:
arr2 = np.array([1,2,3], dtype=np.int64)
arr2.dtype

dtype('int64')

In [69]:
arr = np.array([1,43,65,123,5])
arr.dtype

dtype('int64')

In [73]:
float_arr = arr.astype(np.float64)
float_arr.dtype

dtype('float64')

In [77]:
arr = np.array([3.7, -1.2, -2.6, 0.5, 12.9, 10.1])
arr

array([ 3.7, -1.2, -2.6,  0.5, 12.9, 10.1])

In [81]:
arr.astype(np.int32)

array([ 3, -1, -2,  0, 12, 10], dtype=int32)
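
As the output shows, casting floating-point values to an integer dtype drops the fractional part (truncation toward zero) rather than rounding. If rounding is wanted, one option is to round first and cast afterwards. The sketch below uses `np.rint`, which rounds to the nearest integer and sends halves to the nearest even value:

```python
import numpy as np

arr = np.array([3.7, -1.2, -2.6, 0.5, 12.9, 10.1])

# Plain cast: the fractional part is discarded
truncated = arr.astype(np.int32)

# Round to the nearest integer first, then cast
rounded = np.rint(arr).astype(np.int32)

print(truncated)  # [ 3 -1 -2  0 12 10]
print(rounded)    # [ 4 -1 -3  0 13 10]
```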

In [86]:
# Fixed-width byte strings; np.string_ is the pre-2.0 alias of np.bytes_
numeric_strings = np.array(['3.7', '-1.2', '-2.6', '0.5', '12.9', '10.1'], dtype=np.bytes_)
numeric_strings.astype(float)

array([ 3.7, -1.2, -2.6,  0.5, 12.9, 10.1])

In [89]:
int_arr = np.arange(10)
calibers = np.array([.22, .270, .357, .380, .44, .50], dtype=np.float64)
int_arr.astype(calibers.dtype)

array([0., 1., 2., 3., 4., 5., 6., 7., 8., 9.])
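
Two properties of `astype` are worth checking alongside the cells above: by default it returns a new array, even when the requested dtype matches the current one, and a string that cannot be parsed as a number raises `ValueError`. A short sketch, where `same_dtype` and `bad` are names invented for this example:

```python
import numpy as np

int_arr = np.arange(10)

# astype copies by default, so modifying the result leaves the original intact
same_dtype = int_arr.astype(int_arr.dtype)
same_dtype[0] = 99
print(int_arr[0], same_dtype[0])

# A value that is not a valid number makes the conversion fail
bad = np.array(["1.5", "abc"])
try:
    bad.astype(float)
except ValueError as exc:
    print("conversion failed:", exc)
```

The first `print` shows `0 99`, which confirms that the two arrays do not share data.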