In [1]:
import numpy as np


# Creating ndarrays

The easiest way to create an array is to use the `np.array` function. It accepts any sequence-like object (including other arrays) and produces a new NumPy array containing the passed data. For example, a list is a good candidate for conversion:


In [12]:
x = [[1, 2, 3], [4, 5, 6]]
x = np.array(object=x)
x


array([[1, 2, 3],
       [4, 5, 6]])

In [20]:
x = [[[1, 2, 3], [1, 2, 3]], [[1, 2, 3], [1, 2, 3]],
     [[1, 2, 3], [1, 2, 3]], [[1, 2, 3], [1, 2, 3]]]
x = np.array(object=x)
x


array([[[1, 2, 3],
        [1, 2, 3]],

       [[1, 2, 3],
        [1, 2, 3]],

       [[1, 2, 3],
        [1, 2, 3]],

       [[1, 2, 3],
        [1, 2, 3]]])

In [21]:
x.shape


(4, 2, 3)

In [22]:
x.ndim


3
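As a minimal sketch of the "any sequence-like object" point, the cell below passes nested tuples, an existing array, and a list of mixed numbers to `np.array`. The variable names (`from_tuples`, `original`, `copied`, `mixed`) are illustrative and not part of the notebook above.

```python
import numpy as np

# Nested tuples work just like nested lists.
from_tuples = np.array(((1, 2, 3), (4, 5, 6)))

# Passing an existing array produces a copy by default.
original = np.array([1, 2, 3])
copied = np.array(original)
copied[0] = 99  # `original` is not affected

# Mixing ints and floats yields a floating-point array.
mixed = np.array([1, 2.5, 3])

print(from_tuples.shape)  # (2, 3)
print(original)           # [1 2 3]
print(mixed.dtype)        # float64
```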

In addition to `np.array`, there are a number of other functions for creating new arrays. For example, `zeros` and `ones` create arrays of 0s or 1s, respectively, with a given length or shape, while `empty` creates an array without initializing its values, so its contents are arbitrary. To create a higher-dimensional array with these functions, pass a tuple for the shape:


In [23]:
x = np.zeros(shape=10)
x


array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])

In [25]:
x = np.ones(shape=(2, 3))
x


array([[1., 1., 1.],
       [1., 1., 1.]])
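The text mentions `empty` without showing it, so here is a small sketch. The shape and the name `buffer` are chosen only for illustration; because the initial contents are whatever happened to be in memory, the example fills the array before reading from it.

```python
import numpy as np

# np.empty allocates memory without filling it; the initial values are arbitrary.
buffer = np.empty(shape=(2, 3, 2))

print(buffer.shape)  # (2, 3, 2)
print(buffer.dtype)  # float64

# Overwrite every element before using the array.
buffer[:] = 0.0
```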

In [32]:
x = np.identity(n=4)
x


array([[1., 0., 0., 0.],
       [0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.]])

In [33]:
x = np.full(shape=(2, 3), fill_value=3)
x


array([[3, 3, 3],
       [3, 3, 3]])

In [26]:
x = np.arange(start=40, stop=55)
x


array([40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54])

In [27]:
x = np.linspace(start=0, stop=1, num=10)
x


array([0.        , 0.11111111, 0.22222222, 0.33333333, 0.44444444,
       0.55555556, 0.66666667, 0.77777778, 0.88888889, 1.        ])
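`arange` and `linspace` are easy to confuse, so the following sketch contrasts them on the same interval. The names `by_step` and `by_count` are illustrative only.

```python
import numpy as np

# arange takes a step size and excludes the stop value.
by_step = np.arange(start=0, stop=1, step=0.25)

# linspace takes a number of points and includes the stop value by default.
by_count = np.linspace(start=0, stop=1, num=5)

print(by_step)   # [0.   0.25 0.5  0.75]
print(by_count)  # [0.   0.25 0.5  0.75 1.  ]
```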

In [28]:
x = np.random.rand(3, 2)
x


array([[0.19260286, 0.75027925],
       [0.84243   , 0.08037434],
       [0.93830969, 0.84734335]])

In [30]:
x = np.random.randn(3, 4)
x


array([[ 0.0286256 ,  1.12663327, -0.77737931,  0.33605433],
       [-0.71204291, -0.50303708,  0.08776212,  0.59927779],
       [ 1.12225172, -1.09467128,  0.37195567,  0.39207153]])
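Random draws like the ones above differ on every run. As a hedged alternative, the sketch below uses NumPy's `np.random.default_rng` generator with a seed so the draws are reproducible; the seed value and variable names are arbitrary choices for illustration, not something the notebook above uses.

```python
import numpy as np

# A seeded generator makes the random draws reproducible.
rng = np.random.default_rng(seed=42)
uniform = rng.random(size=(3, 2))          # uniform on [0, 1), like rand
normal = rng.standard_normal(size=(3, 4))  # standard normal, like randn

# A second generator with the same seed reproduces the same first draw.
rng_again = np.random.default_rng(seed=42)
same_uniform = rng_again.random(size=(3, 2))

print(np.array_equal(uniform, same_uniform))  # True
```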

`ones_like`, `zeros_like` and `full_like` each take another array and produce a new array of the same shape and dtype, filled with ones, zeros or a given value, respectively.
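A short sketch of the three `*_like` functions, using a hypothetical `template` array with an explicit dtype so the inherited dtype is easy to see:

```python
import numpy as np

template = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int32)

ones = np.ones_like(template)
zeros = np.zeros_like(template)
sevens = np.full_like(template, fill_value=7)

print(sevens)
# [[7 7 7]
#  [7 7 7]]
print(sevens.dtype)  # int32
```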


# Data Types for ndarrays

The data type, or `dtype`, is a special object containing the information (or metadata, data about data) the ndarray needs to interpret a chunk of memory as a particular type of data:


In [35]:
x = np.array(object=x, dtype=np.float64)
x.dtype


dtype('float64')
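To make the "chunk of memory" idea concrete, the sketch below stores the same three values with two different dtypes and inspects how many bytes each element and each array occupy. The names `small` and `large` are illustrative.

```python
import numpy as np

small = np.array([1, 2, 3], dtype=np.int8)
large = np.array([1, 2, 3], dtype=np.float64)

# itemsize is bytes per element; nbytes is bytes for the whole array.
print(small.itemsize, small.nbytes)  # 1 3
print(large.itemsize, large.nbytes)  # 8 24
```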

Don’t worry about memorizing the NumPy dtypes, especially if you’re a new user. It’s often only necessary to care about the general kind of data you’re dealing with, whether floating point, complex, integer, boolean, string, or general Python object. When you need more control over how data are stored in memory and on disk, especially large datasets, it is good to know that you have control over the storage type.


You can explicitly convert or cast an array from one dtype to another using ndarray’s astype method:


In [36]:
x = x.astype(dtype=np.int32)
x.dtype


dtype('int32')
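Casting is most visible when the target dtype differs from the source. The following sketch, with made-up sample values, shows that casting floats to integers drops the fractional part and that an array of numeric strings can be cast to floats. `astype` returns a new array and leaves the original untouched.

```python
import numpy as np

floats = np.array([3.7, -1.2, 0.5, 10.9])

# Casting to an integer dtype truncates toward zero.
ints = floats.astype(np.int32)
print(ints)          # [ 3 -1  0 10]
print(floats.dtype)  # float64 (the original array is unchanged)

# Strings that represent numbers can be cast to a numeric dtype.
numeric_strings = np.array(["1.25", "-9.6", "42"])
parsed = numeric_strings.astype(np.float64)
print(parsed.dtype)  # float64
```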

It’s important to be cautious when using the `numpy.string_` type (an alias of `numpy.bytes_` that was removed in NumPy 2.0), as string data in NumPy is fixed size and may truncate input without warning. pandas has more intuitive out-of-the-box behavior on non-numeric data.
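The silent truncation can be reproduced in a few lines. This sketch uses byte strings, which NumPy stores in a fixed-width bytes dtype; the array name and values are illustrative.

```python
import numpy as np

# The dtype is sized to fit the longest element at creation time (2 bytes here).
names = np.array([b"ab", b"cd"])
print(names.dtype.itemsize)  # 2

# Assigning a longer value truncates it to the fixed width, with no warning.
names[0] = b"abcdef"
print(names[0])  # b'ab'
```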


# Arithmetic with NumPy Arrays

Arrays are important because they enable you to express batch operations on data without writing any for loops. NumPy users call this vectorization. Any arithmetic operation between equal-size arrays is applied element-wise:


In [39]:
x = np.array([[1., 2., 3.], [4., 5., 6.]])
x


array([[1., 2., 3.],
       [4., 5., 6.]])

In [41]:
x * x


array([[ 1.,  4.,  9.],
       [16., 25., 36.]])

In [42]:
x - x


array([[0., 0., 0.],
       [0., 0., 0.]])
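To show what vectorization replaces, the sketch below computes the same element-wise sum twice: once with explicit Python loops and once with a single array expression. The arrays `a` and `b` are made-up sample data.

```python
import numpy as np

a = np.array([[1., 2., 3.], [4., 5., 6.]])
b = np.array([[10., 20., 30.], [40., 50., 60.]])

# Explicit loops: visit every (row, column) position by hand.
looped = np.empty_like(a)
for i in range(a.shape[0]):
    for j in range(a.shape[1]):
        looped[i, j] = a[i, j] + b[i, j]

# Vectorized: one expression, applied element-wise.
vectorized = a + b

print(np.array_equal(looped, vectorized))  # True
```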

Arithmetic operations with scalars propagate the scalar argument to each element in the array:


In [40]:
x * 2


array([[ 2.,  4.,  6.],
       [ 8., 10., 12.]])

In [44]:
1 / x


array([[1.        , 0.5       , 0.33333333],
       [0.25      , 0.2       , 0.16666667]])

Comparisons between arrays of the same size yield boolean arrays:

In [45]:
y = x * 2
x > y

array([[False, False, False],
       [False, False, False]])
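A boolean array from a comparison can be summarized directly. In this sketch the `threshold` array and its value are arbitrary; `sum` counts the `True` entries, while `any` and `all` reduce the mask to a single answer.

```python
import numpy as np

x = np.array([[1., 2., 3.], [4., 5., 6.]])
threshold = np.full_like(x, fill_value=3.5)

# Element-wise comparison between two arrays of the same shape.
mask = x > threshold

print(mask.sum())              # 3
print(mask.any(), mask.all())  # True False
```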

# Basic Indexing and Slicing
NumPy array indexing is a rich topic, as there are many ways you may want to select a subset of your data or individual elements. One-dimensional arrays are simple; on the surface they act similarly to Python lists:

In [46]:
x = np.arange(start=1, stop=10)
x

array([1, 2, 3, 4, 5, 6, 7, 8, 9])

In [47]:
x[4]

5

In [48]:
x[5:8]

array([6, 7, 8])

In [49]:
x[5:8] = 12
x

array([ 1,  2,  3,  4,  5, 12, 12, 12,  9])
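One difference from Python lists is worth a sketch: a slice of a NumPy array is a view onto the same data, so writing through the slice changes the original array. The names `arr`, `window` and `independent` are illustrative; call `.copy()` when an independent array is needed.

```python
import numpy as np

arr = np.arange(start=1, stop=10)

# A slice is a view, not a copy: changes show up in the original array.
window = arr[5:8]
window[0] = 100
print(arr)  # [  1   2   3   4   5 100   7   8   9]

# An explicit copy is independent of the original.
independent = arr[5:8].copy()
independent[:] = 0
print(arr[5:8])  # [100   7   8]
```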