# NumPy Cheat Sheet

NumPy is the fundamental package for scientific computing with Python. It provides a high-performance multidimensional array object and tools for working with these arrays.
Vectorized NumPy operations can be orders of magnitude faster than iterating over Python lists element by element; the exact speedup depends on the operation and the array size.

[Official documentation](https://numpy.org/) | DataCamp Cheat sheet: [Web](https://www.datacamp.com/cheat-sheet/numpy-cheat-sheet-data-analysis-in-python) | [PDF](https://s3.amazonaws.com/assets.datacamp.com/blog_assets/Numpy_Python_Cheat_Sheet.pdf)


## Quick links

> You can use the Navigate menu item in your browser or the Outline feature in VS Code to jump to different locations in the Notebook. If you are viewing a static version, jump using the poor man's version below.

- [Data types](#data-types)
- [Properties of a NumPy array](#properties-of-a-numpy-array)
- [Initialization](#initialization)
- [Information](#information)
- [Input/Output](#inputoutput)
- [Attribute inspection](#inspection)
- [Copy](#copy-arrays)
- [Sort](#sort-arrays)
- [Select](#select-from-arrays)
- [Arithmetic](#arithmetic-operations)
- [Comparisons](#comparisons)
- [Manipulate](#manipulate-arrays)
- [Vectorized operations](#vectorized-operations)

In [176]:
import numpy as np


## Data types

This section explores the different data types available in `numpy`.
See the official documentation on [basic data types](https://numpy.org/doc/stable/user/basics.types.html) for details.


In [177]:
np.intp   # Signed integer type used for indexing (np.int0 was an alias for it, removed in NumPy 2.0)
np.int32  # Signed 32-bit integer type
np.int64  # Signed 64-bit integer type


numpy.int64

In [178]:
np.float16   # half-precision floating-point number: sign bit, 5 bits exponent, 10 bits mantissa
np.float32   # single-precision floating-point number: sign bit, 8 bits exponent, 23 bits mantissa
np.float64   # double-precision floating-point number: sign bit, 11 bits exponent, 52 bits mantissa


numpy.float64
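
The bit layouts quoted in the comments above can be checked rather than memorized. This is a minimal sketch using `np.iinfo` and `np.finfo`; the variable names are illustrative and not part of the original notebook.

```python
import numpy as np

# Integer types: iinfo reports the representable range.
int32_info = np.iinfo(np.int32)
int32_range = (int32_info.min, int32_info.max)

# Floating-point types: finfo reports the exponent and mantissa widths in bits.
float_layout = {
    dt.__name__: (int(np.finfo(dt).nexp), int(np.finfo(dt).nmant))
    for dt in (np.float16, np.float32, np.float64)
}

print(int32_range)
print(float_layout)
```

Each tuple in `float_layout` is `(exponent bits, mantissa bits)`; the sign bit accounts for the one remaining bit.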

In [179]:
# `np.complex` is a deprecated alias for the builtin `complex`.
# To silence this warning, use `complex` by itself.
# Doing this will not modify any behavior and is safe.
# If you specifically wanted the numpy scalar type, use `np.complex128` here.
complex
np.complex128


numpy.complex128

In [180]:
# `np.bool` is a deprecated alias for the builtin `bool`.
# To silence this warning, use `bool` by itself.
# Doing this will not modify any behavior and is safe.
# If you specifically wanted the numpy scalar type, use `np.bool_` here.
bool
np.bool_


numpy.bool_

In [181]:
# `np.object` is a deprecated alias for the builtin `object`.
# To silence this warning, use `object` by itself.
# Doing this will not modify any behavior and is safe.
object


object

In [182]:
# Fixed-length byte string type.
# np.string_ was an alias for np.bytes_ and was removed in NumPy 2.0.
np.bytes_


numpy.bytes_

In [183]:
# Fixed-length unicode string type.
# np.unicode_ was an alias for np.str_ and was removed in NumPy 2.0.
np.str_


numpy.str_

## Properties of a numpy array

- They must be homogeneous
- They are of fixed size
- The dimensions are known at creation time

Its limitations are that there are no column names and that we are restricted to only one data type.
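
The properties listed above can be observed directly. The sketch below is illustrative only (the variable names are made up): mixed inputs are coerced to a single data type, and "growing" an array actually produces a new one.

```python
import numpy as np

# Mixed ints and floats are upcast to a single float dtype.
mixed_numbers = np.array([1, 2.5, 3])

# Mixed numbers and strings are all converted to strings.
mixed_text = np.array([1, "two", 3.0])

# The size is fixed: np.append returns a new, larger array
# and leaves the original untouched.
original = np.array([1, 2, 3])
extended = np.append(original, 4)

print(mixed_numbers.dtype)       # float64
print(mixed_text.dtype.kind)     # U (unicode string)
print(original.shape, extended.shape)
```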


In [184]:
a = np.array([1, 2, 3])
a


array([1, 2, 3])

In [185]:
b = np.array([(1.5, 2, 3), (4, 5, 6)], dtype=float)
b


array([[1.5, 2. , 3. ],
       [4. , 5. , 6. ]])

In [186]:
c = np.array([[(1.5, 2, 3), (4, 5, 6)], [(3, 2, 1), (4, 5, 6)]], dtype=float)
c


array([[[1.5, 2. , 3. ],
        [4. , 5. , 6. ]],

       [[3. , 2. , 1. ],
        [4. , 5. , 6. ]]])

## Initialization

You can create arrays with initial placeholder values with the following methods.


In [187]:
# Initialize a numpy array with zeros
# data type is float by default
# 3 rows and 4 columns
np.zeros((3, 4))


array([[0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]])

In [188]:
# Initialize a numpy array with ones
# 2 rows and 3 columns
# dtype can be specified
np.ones((2, 3), dtype=np.int16)


array([[1, 1, 1],
       [1, 1, 1]], dtype=int16)

In [189]:
# Create an array of evenly spaced values (step value)
# Start at 10, stop before 100 (the stop value is excluded), step by 15
np.arange(10, 100, 15)


array([10, 25, 40, 55, 70, 85])

In [190]:
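# Create an array of evenly spaced values (number of samples)
# Start at 0, end at 100 (the end value is included), 10 samples
# The integer dtype truncates the fractional spacing (100 / 9 = 11.11...)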
np.linspace(0, 100, 10, dtype=np.int16)


array([  0,  11,  22,  33,  44,  55,  66,  77,  88, 100], dtype=int16)

In [191]:
# Create an array of evenly spaced values (number of samples)
# Start at 0, end at 1, 50 samples as float
np.linspace(0, 1, 50, dtype=np.float64)


array([0.        , 0.02040816, 0.04081633, 0.06122449, 0.08163265,
       0.10204082, 0.12244898, 0.14285714, 0.16326531, 0.18367347,
       0.20408163, 0.2244898 , 0.24489796, 0.26530612, 0.28571429,
       0.30612245, 0.32653061, 0.34693878, 0.36734694, 0.3877551 ,
       0.40816327, 0.42857143, 0.44897959, 0.46938776, 0.48979592,
       0.51020408, 0.53061224, 0.55102041, 0.57142857, 0.59183673,
       0.6122449 , 0.63265306, 0.65306122, 0.67346939, 0.69387755,
       0.71428571, 0.73469388, 0.75510204, 0.7755102 , 0.79591837,
       0.81632653, 0.83673469, 0.85714286, 0.87755102, 0.89795918,
       0.91836735, 0.93877551, 0.95918367, 0.97959184, 1.        ])
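
The endpoint handling is the main difference between `np.arange` and `np.linspace`. A short sketch, with illustrative variable names, that puts the two side by side:

```python
import numpy as np

# arange: the stop value is excluded.
half_open = np.arange(0, 10, 2.5)

# linspace: the stop value is included by default...
closed = np.linspace(0, 1, 5)

# ...unless endpoint=False is passed.
open_ended = np.linspace(0, 1, 5, endpoint=False)

# retstep=True also returns the spacing that linspace chose.
samples, step = np.linspace(0, 100, 10, retstep=True)

print(half_open)   # [0.  2.5 5.  7.5]
print(closed)      # [0.   0.25 0.5  0.75 1.  ]
print(open_ended)
print(step)
```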

In [192]:
# Create a constant array
# 2 rows and 3 columns
# Fill with 42
np.full((2, 3), 42)


array([[42, 42, 42],
       [42, 42, 42]])

In [193]:
# Create a 3x3 identity matrix
# Identity matrix is a square matrix with 1s on the main diagonal
np.eye(3)


array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

In [194]:
# Create an array with random values
# 2 rows and 3 columns
np.random.random((2, 3))


array([[0.768813  , 0.49125213, 0.83011266],
       [0.33902167, 0.69719361, 0.95107229]])

In [195]:
# Create an array of random integers
# From 0 (inclusive) to 10 (exclusive)
# 2 rows and 3 columns
np.random.randint(0, 10, (2, 3))


array([[1, 5, 2],
       [2, 1, 3]])
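
The two cells above use the legacy `np.random` functions. As an alternative sketch, the same draws can be made with a `Generator` from `np.random.default_rng`, which makes results reproducible when a seed is given. The seed value 42 is an arbitrary choice for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=42)  # arbitrary seed, chosen for the example

floats = rng.random((2, 3))               # uniform floats in [0, 1)
ints = rng.integers(0, 10, size=(2, 3))   # integers in [0, 10), high is exclusive

# Re-creating the generator with the same seed reproduces the same draws.
rng_again = np.random.default_rng(seed=42)
floats_again = rng_again.random((2, 3))

print(floats)
print(ints)
```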

In [196]:
# Create an empty array
# Uninitialized, output may vary
# 3 rows and 2 columns
# dtype is float by default
np.empty((3, 2))


array([[0.768813  , 0.49125213],
       [0.83011266, 0.33902167],
       [0.69719361, 0.95107229]])

## Information

Get help information for a function, class, or module.


In [197]:
# Get information about the np.ndarray.dtype
np.info(np.ndarray.dtype)


Data-type of the array's elements.


    Setting ``arr.dtype`` is discouraged and may be deprecated in the
    future.  Setting will replace the ``dtype`` without modifying the
    memory (see also `ndarray.view` and `ndarray.astype`).

Parameters
----------
None

Returns
-------
d : numpy dtype object

See Also
--------
ndarray.astype : Cast the values contained in the array to a new data-type.
ndarray.view : Create a view of the same data but a different data-type.
numpy.dtype

Examples
--------
>>> x
array([[0, 1],
       [2, 3]])
>>> x.dtype
dtype('int32')
>>> type(x.dtype)
<type 'numpy.dtype'>


## Input/Output

You can read from and write to different types of files. This section explores the numerous ways.


### Working with binary files on disk


In [198]:
# Save an array to a binary file in NumPy .npy format to disk.
# Here we create and save a 3x3 identity matrix in a file named identity_matrix.npy
np.save('data/identity_matrix', np.eye(3))

# Load the data from the file
np.load('data/identity_matrix.npy')


array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

In [199]:
# Create arrays a and b
# Save several arrays into a single file in uncompressed .npz format.
# Provide arrays as keyword arguments to store them under the corresponding name in the output file: savez(fn, x=x, y=y).
# If arrays are specified as positional arguments, i.e., savez(fn, x, y), their names will be arr_0, arr_1, etc.
a = np.array([1, 2, 3])
b = np.array([(1.5, 2, 3), (4, 5, 6)], dtype=float)
np.savez('data/a_and_b.npz', a=a, b=b)

# Load the data from the file targeting the variable a
np.load('data/a_and_b.npz')['a']


array([1, 2, 3])
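
When the names stored in an `.npz` file are not known in advance, they can be listed through the `files` attribute of the loaded archive. The sketch below writes to a temporary directory instead of `data/` so that it runs anywhere; that path choice is an assumption of the example, not of the notebook.

```python
import os
import tempfile

import numpy as np

a = np.array([1, 2, 3])
b = np.array([(1.5, 2, 3), (4, 5, 6)], dtype=float)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "a_and_b.npz")
    np.savez(path, a=a, b=b)

    # np.load on an .npz file returns a lazy, dict-like NpzFile.
    # Using it as a context manager closes the file when done.
    with np.load(path) as archive:
        names = sorted(archive.files)
        loaded = {name: archive[name] for name in names}

print(names)        # ['a', 'b']
print(loaded["b"])
```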

In [200]:
# Load arrays or pickled objects from .npy, .npz or pickled files.
np.load('data/identity_matrix.npy')


array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

### Working with text files


In [201]:
# Load data from a text file.
# Each row in the text file must have the same number of values.
np.loadtxt('data/text.txt', delimiter=',')


array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

In [202]:
# Load data from a text file, with missing values handled as specified.
# Each line past the first skip_header lines is split at the delimiter character, and characters following the comments character are discarded.
np.genfromtxt('data/csv.csv', delimiter=',')


array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])
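
The "missing values handled as specified" part is what sets `np.genfromtxt` apart from `np.loadtxt`. A minimal sketch, using an in-memory string in place of a file (the CSV content is made up for the example):

```python
from io import StringIO

import numpy as np

csv_text = "1,2,3\n4,,6\n"  # the middle value of the second row is missing

# By default a missing float becomes nan.
with_nan = np.genfromtxt(StringIO(csv_text), delimiter=",")

# filling_values substitutes a value of your choice instead.
with_zero = np.genfromtxt(StringIO(csv_text), delimiter=",", filling_values=0)

print(with_nan)
print(with_zero)
```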

In [203]:
# Save an array to a text file.
np.savetxt('data/save.txt', np.array([[1, 2], [3, 4]]), delimiter=',')
np.loadtxt('data/save.txt', delimiter=',')


array([[1., 2.],
       [3., 4.]])

## Inspection

This section dives into the different attributes (or properties) of an array.


In [204]:
# Return a tuple of array dimensions.
# Rows and columns.
np.array([[1, 2], [3, 4], [5, 6]]).shape


(3, 2)

In [205]:
# Return the number of dimensions of the array.
np.array([[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]]).ndim


2

In [206]:
# Return the number of elements in an array.
np.array([[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]]).size


10

In [207]:
# Return the length of the first axis (here, the number of rows).
len(np.array([[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]]))


5

In [208]:
# Return the data type of an array.
# Remember that ndarrays are homogeneous so they only have one data type.
np.array([[1, 2], [3, 4]]).dtype


dtype('int64')

In [209]:
# Return the data type of an array in a string format.
np.array([[1, 2], [3, 4]]).dtype.name


'int64'

In [210]:
# Return an array where each element is converted to a different data type.
np.array([[1, 2], [3, 4]]).astype(str)


array([['1', '2'],
       ['3', '4']], dtype='<U21')

In [211]:
# Return the size in bytes of each element of an array.
np.array([[1, 2], [3, 4]]).itemsize


8

In [212]:
# Return the size in bytes of the whole array.
np.array([[1, 2], [3, 4]]).nbytes


32
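
The last two attributes are related: `nbytes` equals `size * itemsize`. The sketch below sets the dtype explicitly so the numbers do not depend on the platform's default integer type.

```python
import numpy as np

wide = np.array([[1, 2], [3, 4]], dtype=np.int64)
narrow = wide.astype(np.int8)  # astype returns a converted copy

print(wide.itemsize, wide.nbytes)      # 8 32
print(narrow.itemsize, narrow.nbytes)  # 1 4
```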

## Copy arrays


In [213]:
# Create a view of the array with the same data.
a = np.array([[1, 2],
              [3, 4],
              [5, 6],
              [7, 8],
              [9, 10]])
a.view()


array([[ 1,  2],
       [ 3,  4],
       [ 5,  6],
       [ 7,  8],
       [ 9, 10]])

In [214]:
# Return an array copy of the given object.
# The copy owns the data and any changes to the copy will not affect the original array, and vice versa.
a = np.array([[1, 2],
              [3, 4],
              [5, 6],
              [7, 8],
              [9, 10]])
np.copy(a)


array([[ 1,  2],
       [ 3,  4],
       [ 5,  6],
       [ 7,  8],
       [ 9, 10]])

In [215]:
# Create a copy of the array with its own data buffer.
# For numeric arrays the copy shares nothing with the original.
# (For object arrays, the contained Python objects themselves are not copied.)
a = np.array([[1, 2],
              [3, 4],
              [5, 6],
              [7, 8],
              [9, 10]])
a.copy()


array([[ 1,  2],
       [ 3,  4],
       [ 5,  6],
       [ 7,  8],
       [ 9, 10]])
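
The outputs above look identical, so the difference between a view and a copy only shows up when one of them is modified. A small sketch with illustrative names:

```python
import numpy as np

original = np.array([1, 2, 3])
view = original.view()
duplicate = original.copy()

view[0] = 99        # writes through to the original
duplicate[1] = -1   # leaves the original untouched

print(original)                                # [99  2  3]
print(np.shares_memory(original, view))        # True
print(np.shares_memory(original, duplicate))   # False
```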

## Sort arrays


In [216]:
# Sorts an array in place.
# The array is sorted in ascending order.
# The axis along which to sort is optional.
# If axis is not specified, the last axis is used, so each row of a 2d array is sorted independently.
a = np.array([[5, 1, 2],
              [6, 9, 10],
              [5, 5, 6],
              [9, 3, 4],
              [0, 7, 8]])
a.sort()
a


array([[ 1,  2,  5],
       [ 6,  9, 10],
       [ 5,  5,  6],
       [ 3,  4,  9],
       [ 0,  7,  8]])

In [217]:
# Sorts an array in place along axis 0.
# Axis 0 is the vertical axis in a 2d array, so each column is sorted independently.
a = np.array([[5, 1, 2],
              [6, 9, 10],
              [5, 5, 6],
              [9, 3, 4],
              [0, 7, 8]])
a.sort(axis=0)
a


array([[ 0,  1,  2],
       [ 5,  3,  4],
       [ 5,  5,  6],
       [ 6,  7,  8],
       [ 9,  9, 10]])
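
`ndarray.sort` changes the array in place. When the original order needs to be kept, `np.sort` returns a sorted copy and `np.argsort` returns the indices that would sort the array. A brief sketch with made-up data:

```python
import numpy as np

scores = np.array([30, 10, 20])

sorted_copy = np.sort(scores)   # new array, scores is unchanged
order = np.argsort(scores)      # indices that would sort scores
descending = sorted_copy[::-1]  # reverse for descending order

print(scores)       # [30 10 20]
print(sorted_copy)  # [10 20 30]
print(order)        # [1 2 0]
print(descending)   # [30 20 10]
```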

## Select from arrays


### Get a subset of data

Get a new array that contains a subset of the elements of the original array.
The subset can be created by selecting specific indices or by using a boolean mask to select elements that satisfy a certain condition.


In [218]:
# Select the row at a given index (indexing along the first axis).
a = np.array([[1,  2,  5],
              [6,  9, 10],
              [5,  5,  6],
              [3,  4,  9],
              [0,  7,  8]])
a[2]


array([5, 5, 6])

In [219]:
# Select the element at a given row and column.
# Equivalent to a[3][1]
a = np.array([[1,  2,  5],
              [6,  9, 10],
              [5,  5,  6],
              [3,  4,  9],
              [0,  7,  8]])
a[3, 1]


4

In [220]:
# Select elements using boolean indexing.
# Return a new array of all elements that are greater than 5.
a = np.array([[1,  2,  5],
              [6,  9, 10],
              [5,  5,  6],
              [3,  4,  9],
              [0,  7,  8]])
a[a > 5]


array([ 6,  9, 10,  6,  9,  7,  8])

### Slice data

Get parts or entire rows or columns of a given array.

`ndarray[start:stop:step]`


In [221]:
a = np.array([[1,  2,  3],
              [4,  5,  6],
              [7,  8,  9]])
# Select the rows at index 0 and 1 (the stop index 2 is excluded).
a[0:2]


array([[1, 2, 3],
       [4, 5, 6]])

In [222]:
a = np.array([[1,  2,  3],
              [4,  5,  6],
              [7,  8,  9]])
# Select items at rows index 0 and index 1 in column index 1.
a[0:2, 1]


array([2, 5])

In [223]:
a = np.array([[1,  2,  3],
              [4,  5,  6],
              [7,  8,  9]])
# Select all rows from index 2 to the end.
a[2:]


array([[7, 8, 9]])

In [224]:
a = np.array([[1,  2,  3],
              [4,  5,  6],
              [7,  8,  9]])
# Select all items before row index 2.
a[:2]


array([[1, 2, 3],
       [4, 5, 6]])

In [225]:
a = np.array([[1,  2,  3],
              [4,  5,  6],
              [7,  8,  9]])
# Select the whole of row index 1.
a[1, ...]


array([4, 5, 6])

In [226]:
a = np.array([[1,  2,  3],
              [4,  5,  6],
              [7,  8,  9]])
# Select the whole of column index 1.
a[..., 1]


array([2, 5, 8])

In [227]:
a = np.array([[1,  2,  3],
              [4,  5,  6],
              [7,  8,  9]])
# Reverses the order of the rows (first axis). 💥
a[::-1]


array([[7, 8, 9],
       [4, 5, 6],
       [1, 2, 3]])

### Filter data


In [228]:
a = np.array([[1,  2,  3],
              [4,  5,  6],
              [7,  8,  9]])
# Returns where either or both of the operands are True.
a[(a > 7) | (a < 3)]


array([1, 2, 8, 9])

In [229]:
a = np.array([[1,  2,  3],
              [4,  5,  6],
              [7,  8,  9]])
# Returns where both operands are True.
a[(a > 3) & (a < 8)]


array([4, 5, 6, 7])

In [230]:
a = np.array([[1,  2,  3],
              [4,  5,  6],
              [7,  8,  9]])
# Returns all even numbers.
a[a % 2 == 0]


array([2, 4, 6, 8])
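
A boolean mask can do more than extract values. The sketch below, which is an illustration and not part of the original notebook, uses a mask to assign in place, to choose between two values with `np.where`, and to find positions with `np.nonzero`.

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])

mask = a % 2 == 0

# Keep even values, replace everything else with 0.
evens_only = np.where(mask, a, 0)

# Assign through a mask: cap every value above 5 at 5.
capped = a.copy()
capped[capped > 5] = 5

# Row and column indices of the True entries in the mask.
rows, cols = np.nonzero(mask)

print(evens_only)
print(capped)
print(rows, cols)  # [0 1 1 2] [1 0 2 1]
```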

## Arithmetic operations


### Basic operations


In [231]:
# Addition.
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
a + b


array([5, 7, 9])

In [232]:
# Addition.
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
np.add(b, a)


array([5, 7, 9])

In [233]:
# Subtraction.
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
b - a


array([3, 3, 3])

In [234]:
# Subtraction.
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
np.subtract(b, a)


array([3, 3, 3])

In [235]:
# Multiplication.
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
a * b


array([ 4, 10, 18])

In [236]:
# Multiplication.
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
np.multiply(a, b)


array([ 4, 10, 18])

In [237]:
# Division.
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
b / a


array([4. , 2.5, 2. ])

In [238]:
# Division.
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
np.divide(b, a)


array([4. , 2.5, 2. ])

In [239]:
# Exponential.
a = np.array([1, 2, 3])
np.exp(a)


array([ 2.71828183,  7.3890561 , 20.08553692])

In [240]:
# Square root.
a = np.array([9, 16, 25])
np.sqrt(a)


array([3., 4., 5.])

### Trigonometric and other mathematical functions


In [241]:
# Element-wise sine.
a = np.array([1, 2, 3])
np.sin(a)


array([0.84147098, 0.90929743, 0.14112001])

In [242]:
# Element-wise cosine.
a = np.array([1, 2, 3])
np.cos(a)


array([ 0.54030231, -0.41614684, -0.9899925 ])

In [243]:
# Element-wise natural logarithm.
a = np.array([1, 2, 3])
np.log(a)


array([0.        , 0.69314718, 1.09861229])

In [244]:
# Dot product (inner product) of two 1-D arrays.
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
a.dot(b)


32
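
For 2-D arrays it is easy to confuse `*` (element-wise) with the matrix product. A short sketch of the difference, using the `@` operator for the matrix product (the arrays are made up for the example):

```python
import numpy as np

m = np.array([[1, 2],
              [3, 4]])
v = np.array([10, 20])

elementwise = m * m     # multiplies matching elements
matrix_product = m @ m  # rows times columns, same as m.dot(m)
matrix_vector = m @ v   # matrix times vector

print(elementwise)     # [[ 1  4] [ 9 16]]
print(matrix_product)  # [[ 7 10] [15 22]]
print(matrix_vector)   # [ 50 110]
```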

## Comparisons


In [245]:
# Element-wise equality comparison.
a = np.array([1, 2, 3, 4])
b = np.array([4, 5, 6, 4])
a == b  # Also: !=


array([False, False, False,  True])

In [246]:
# Element-wise less-than comparison.
a = np.array([1, 2, 3, 4])
b = np.array([4, 5, 6, 4])
a < b  # Also: > <= >=


array([ True,  True,  True, False])

In [247]:
# Array-wise equality comparison.
# Returns True if two arrays have the same shape and elements, False otherwise.
a = np.array([1, 2, 3, 4])
b = np.array([4, 3, 2, 1])
np.array_equal(a, b)


False

In [248]:
# Array-wise comparison.
# Returns True if the input arrays are shape consistent and all elements are equal.
# Shape consistent means they are either the same shape, or one input array can be broadcast to the shape of the other.
# 🤔 Interesting.
a = np.array([1, 2, 3, 4])
b = np.array([[1, 2, 3, 4], [1, 2, 3, 4]])
np.array_equiv(a, b)


True
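
The comparisons above are exact, which can be surprising with floating-point results. As a hedged sketch, `np.isclose` and `np.allclose` compare within a tolerance instead (default tolerances are used here; the values are chosen only to trigger a rounding difference).

```python
import numpy as np

computed = np.array([0.1 + 0.2, 0.5])
expected = np.array([0.3, 0.5])

exact = np.array_equal(computed, expected)     # exact comparison
per_element = np.isclose(computed, expected)   # element-wise, with tolerance
close = np.allclose(computed, expected)        # array-wise, with tolerance

print(exact)        # False, because 0.1 + 0.2 is not exactly 0.3
print(per_element)  # [ True  True]
print(close)        # True
```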

## Manipulate arrays


In [249]:
a = np.array([[1,  2,  3],
              [4,  5,  6],
              [7,  8,  9]])
# Switch rows with columns.
np.transpose(a)


array([[1, 4, 7],
       [2, 5, 8],
       [3, 6, 9]])

In [250]:
a = np.array([[1,  2,  3],
              [4,  5,  6],
              [7,  8,  9]])
# Flatten the array.
np.ravel(a)


array([1, 2, 3, 4, 5, 6, 7, 8, 9])

In [251]:
a = np.array([[1,  2,  3],
              [4,  5,  6],
              [7,  8,  9]])
# Resizes the array in place to 1 x 2, keeping only the first two elements in flattened order.
a.resize((1, 2))
a


array([[1, 2]])

In [252]:
a = np.array([[1,  2,  3],
              [4,  5,  6],
              [7,  8,  9]])
# Appends values to the end of an array, flattening the array.
b = np.append(a, [1, 2, 3])
b


array([1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3])

In [253]:
a = np.array([[1,  2,  3],
              [4,  5,  6],
              [7,  8,  9]])
# Inserts the value 1 before indices 1, 2 and 3 of the flattened array (no axis given, so the array is flattened first).
b = np.insert(a, [1, 2, 3], 1)
b


array([1, 1, 2, 1, 3, 1, 4, 5, 6, 7, 8, 9])

In [254]:
a = np.array([[1,  2,  3],
              [4,  5,  6],
              [7,  8,  9]])
# Deletes the values at the given indices, flattening the array.
b = np.delete(a, [1, 2, 3])
b


array([1, 5, 6, 7, 8, 9])
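
The three cells above flatten the array because no `axis` is given. Passing `axis` keeps the 2-D shape and works on whole rows or columns instead. A minimal sketch with illustrative values:

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])

# Append a row: the new values must have the same number of columns.
with_row = np.append(a, [[10, 11, 12]], axis=0)

# Insert a column of zeros before column index 1.
with_col = np.insert(a, 1, 0, axis=1)

# Delete the row at index 1.
without_row = np.delete(a, 1, axis=0)

print(with_row.shape)  # (4, 3)
print(with_col)
print(without_row)
```

All three functions return new arrays; `a` itself is unchanged.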

### Combine arrays


In [255]:
# Join a sequence of arrays along an existing axis.
np.concatenate(([1, 2, 3], [4, 5, 6]))


array([1, 2, 3, 4, 5, 6])

In [256]:
# Stack arrays in sequence vertically (row wise).
# This is equivalent to concatenation along the first axis after 1-D arrays of shape (N,) have been reshaped to (1,N). Rebuilds arrays divided by vsplit.
# This function makes most sense for arrays with up to 3 dimensions. For instance, for pixel-data with a height (first axis), width (second axis), and r/g/b channels (third axis).
# The functions concatenate, stack and block provide more general stacking and concatenation operations.
np.vstack(([1, 2, 3], [4, 5, 6]))


array([[1, 2, 3],
       [4, 5, 6]])

In [257]:
# Stack arrays in sequence horizontally (column wise).
# This is equivalent to concatenation along the second axis, except for 1-D arrays where it concatenates along the first axis. Rebuilds arrays divided by hsplit.
# This function makes most sense for arrays with up to 3 dimensions. For instance, for pixel-data with a height (first axis), width (second axis), and r/g/b channels (third axis).
# The functions concatenate, stack and block provide more general stacking and concatenation operations.
np.hstack(([1, 2, 3], [4, 5, 6]))


array([1, 2, 3, 4, 5, 6])

In [258]:
# Stack 1-D arrays as columns into a 2-D array.
# Take a sequence of 1-D arrays and stack them as columns to make a single 2-D array.
# 2-D arrays are stacked as-is, just like with hstack. 1-D arrays are turned into 2-D columns first.
np.column_stack(([1, 2, 3], [4, 5, 6]))


array([[1, 4],
       [2, 5],
       [3, 6]])
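
With 2-D inputs the relationship between these functions is easier to see. The sketch below (arrays invented for the example) shows `np.concatenate` along each axis next to its `vstack`/`hstack` equivalent, plus `np.stack`, which adds a new axis.

```python
import numpy as np

top = np.array([[1, 2],
                [3, 4]])
bottom = np.array([[5, 6],
                   [7, 8]])

rows_joined = np.concatenate((top, bottom), axis=0)  # same result as np.vstack
cols_joined = np.concatenate((top, bottom), axis=1)  # same result as np.hstack
stacked = np.stack((top, bottom))                    # new leading axis

print(rows_joined.shape)  # (4, 2)
print(cols_joined.shape)  # (2, 4)
print(stacked.shape)      # (2, 2, 2)
```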

### Split arrays


In [259]:
# Split an array into multiple sub-arrays horizontally (column-wise).
# Please refer to the split documentation.
# hsplit is equivalent to split with axis=1, the array is always split along the second axis except for 1-D arrays, where it is split at axis=0.
np.hsplit(np.array([1, 2, 3, 4, 5, 6]), 3)


[array([1, 2]), array([3, 4]), array([5, 6])]

In [260]:
# Split an array into multiple sub-arrays vertically (row-wise).
# It only works on arrays of 2 or more dimensions.
# Please refer to the split documentation.
# vsplit is equivalent to split with axis=0 (default), the array is always split along the first axis regardless of the array dimension.
np.vsplit(np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]]), 1)


[array([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])]
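
Besides a number of equal sections, the split functions also accept a list of indices at which to cut, and `np.array_split` tolerates sections of unequal size. A short sketch with a made-up grid:

```python
import numpy as np

grid = np.arange(12).reshape(3, 4)

# Three equal sections along the rows, each of shape (1, 4).
rows = np.vsplit(grid, 3)

# Cut before column index 1: shapes (3, 1) and (3, 3).
left, right = np.hsplit(grid, [1])

# array_split allows uneven sections: 7 items in 3 parts.
uneven = np.array_split(np.arange(7), 3)

print([r.shape for r in rows])
print(left.shape, right.shape)
print([len(part) for part in uneven])  # [3, 2, 2]
```

Stacking the pieces again with `np.vstack(rows)` rebuilds the original `grid`.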

## Vectorized operations


![Axes](attachment:image.png)

In [261]:
# Sum of all array elements. With no axis given, every axis is summed; for a one-dimensional array that is just axis 0.
np.sum(np.array([1, 2, 3]))

6

In [262]:
# Sum of array elements over the vertical axis (axis=0): the rows are collapsed, giving one sum per column.
np.sum(np.array([[1, 2, 3], 
                 [4, 5, 6]]), axis=0)

array([5, 7, 9])

In [263]:
# Sum of array elements over the horizontal axis (axis=1): the columns are collapsed, giving one sum per row.
np.sum(np.array([[1, 2, 3], 
                 [4, 5, 6]]), axis=1)

array([ 6, 15])
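
The same `axis` argument works for other reductions, and `keepdims=True` keeps the reduced axis with length 1. The sketch below also compares a plain Python loop with the vectorized equivalent; the array size and repeat count are arbitrary choices, and the timings will vary by machine, so no speedup figure is claimed here.

```python
import timeit

import numpy as np

data = np.array([[1, 2, 3],
                 [4, 5, 6]])

column_sums = data.sum(axis=0)                 # one value per column
row_sums = data.sum(axis=1)                    # one value per row
row_sums_2d = data.sum(axis=1, keepdims=True)  # shape (2, 1) instead of (2,)
row_means = data.mean(axis=1)

# Loop versus vectorized: sum of squares of 0..99999.
values = list(range(100_000))
array = np.arange(100_000, dtype=np.int64)

loop_total = sum(v * v for v in values)
vector_total = int((array * array).sum())

loop_seconds = timeit.timeit(lambda: sum(v * v for v in values), number=5)
vector_seconds = timeit.timeit(lambda: (array * array).sum(), number=5)

print(column_sums, row_sums, row_means)
print(loop_total == vector_total)
print(f"loop: {loop_seconds:.4f}s, vectorized: {vector_seconds:.4f}s")
```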