# NumPy tutorial

Hugh H. Liu (yinl@cs.wisc.edu)


NumPy is a library for the Python programming language, adding support for large, multi-dimensional arrays and matrices, along with a large collection of high-level mathematical functions to operate on these arrays. 

To use it, first import the package. The alias np is the common convention.

In [1]:
import numpy as np

To create sequences of numbers, NumPy provides the arange function, which is analogous to the Python built-in range but returns an array.

In [2]:
a = np.arange(15)
a

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14])

The reshape method returns the same data arranged in a new shape:

In [3]:
b = a.reshape(3,5)
b

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14]])

In [4]:
b.shape

(3, 5)

The ndim attribute gives the number of dimensions (axes) of an array:

In [5]:
a.ndim

1

In [6]:
b.ndim

2

In [7]:
type(b)

numpy.ndarray
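
The attributes used above can be inspected side by side. The following is a minimal sketch; size is not used in the text above and is included here as an assumption that it helps to see it next to shape and ndim.

```python
import numpy as np

a = np.arange(15)        # 1-D array with 15 elements
b = a.reshape(3, 5)      # same data arranged as 3 rows x 5 columns

# shape is a tuple with one entry per axis; ndim is the length of that tuple.
print(a.shape, a.ndim)   # (15,) 1
print(b.shape, b.ndim)   # (3, 5) 2

# size is the total number of elements, i.e. the product of the shape entries.
print(a.size, b.size)    # 15 15
```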

The function zeros creates an array full of zeros, and the function ones creates an array full of ones. By default, the dtype of the created array is float64, but a different dtype can be requested with the dtype keyword argument.

In [8]:
c = np.zeros((3,4))
c

array([[0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]])

In [9]:
np.ones( (2,3,4), dtype=np.int16 )

array([[[1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1]],

       [[1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1]]], dtype=int16)
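
As a quick check of the default dtype, the sketch below prints the dtype and per-element size of both arrays. The names z and o are chosen only for this illustration.

```python
import numpy as np

z = np.zeros((3, 4))                     # no dtype given, so float64 is used
o = np.ones((2, 3, 4), dtype=np.int16)   # explicit 16-bit integer dtype

print(z.dtype, z.itemsize)   # float64 8  (8 bytes per element)
print(o.dtype, o.itemsize)   # int16 2    (2 bytes per element)
print(o.shape)               # (2, 3, 4)
```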

When arange is used with floating point arguments, it is generally not possible to predict the number of elements obtained, due to the finite floating point precision. For this reason, it is usually better to use the function linspace that receives as an argument the number of elements that we want, instead of the step:

In [10]:
np.arange( 0, 2, 0.3 )   

array([0. , 0.3, 0.6, 0.9, 1.2, 1.5, 1.8])

In [11]:
np.linspace( 0, 2, 9 )                 # 9 numbers from 0 to 2, with step (2-0) / (9-1) = 0.25

array([0.  , 0.25, 0.5 , 0.75, 1.  , 1.25, 1.5 , 1.75, 2.  ])
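
The difference between the two functions can be sketched as follows. The values 0.1, 0.4 and 0.1 are picked only for illustration; with them, rounding in the length calculation can make arange include the endpoint on typical IEEE 754 hardware, so no fixed length is claimed for it here.

```python
import numpy as np

start, stop, step = 0.1, 0.4, 0.1

# With a float step, rounding in (stop - start) / step decides the length,
# so the endpoint may or may not be included.
with_arange = np.arange(start, stop, step)
print(len(with_arange), with_arange)

# linspace takes the number of points instead, so the length is fixed.
with_linspace = np.linspace(start, stop, 4)
print(len(with_linspace), with_linspace)

# retstep=True also returns the spacing that linspace chose.
_, spacing = np.linspace(0, 2, 9, retstep=True)
print(spacing)   # 0.25
```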

## Basic operations
Arithmetic operators on arrays apply elementwise. A new array is created and filled with the result.

In [12]:
a = np.array( [20,30,40,50] )
a

array([20, 30, 40, 50])

In [13]:
b = np.arange( 4 )
b

array([0, 1, 2, 3])

In [14]:
a - b

array([20, 29, 38, 47])

In [15]:
b ** 2

array([0, 1, 4, 9])

In [16]:
10 * np.sin(a)

array([ 9.12945251, -9.88031624,  7.4511316 , -2.62374854])

In [17]:
a < 35

array([ True,  True, False, False])
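
To make "elementwise" concrete, the sketch below compares the array expression with an explicit Python loop over paired elements. The loop is only for illustration; the array expression is the idiomatic form.

```python
import numpy as np

a = np.array([20, 30, 40, 50])
b = np.arange(4)

diff = a - b   # elementwise subtraction, result stored in a new array

# The same result written as an explicit Python loop over paired elements.
loop_diff = np.array([x - y for x, y in zip(a, b)])

print(np.array_equal(diff, loop_diff))   # True
print(diff is a, diff is b)              # False False: a new array was created
print(a)                                 # [20 30 40 50], the operand is unchanged
```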

### Matrix operations

In [18]:
A = np.array( [[1,1],[0,1]] )
A

array([[1, 1],
       [0, 1]])

In [19]:
B = np.array([[2,0],[3,4]])
B

array([[2, 0],
       [3, 4]])

Unlike in many matrix languages, the product operator * operates elementwise in NumPy arrays. The matrix product can be performed using the @ operator or the dot function or method:

In [20]:
A * B # elementwise

array([[2, 0],
       [0, 4]])

In [21]:
A @ B # matrix multiply

array([[5, 4],
       [3, 4]])

In [22]:
A.dot(B) # same as above

array([[5, 4],
       [3, 4]])
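
The sketch below puts the elementwise and matrix products next to each other. np.matmul and np.dot are NumPy functions not called in the cells above; for 2-D arrays they are assumed here to be interchangeable with the @ operator.

```python
import numpy as np

A = np.array([[1, 1], [0, 1]])
B = np.array([[2, 0], [3, 4]])

elementwise = A * B   # multiplies matching entries
matrix = A @ B        # row-by-column matrix product

# For 2-D arrays, np.matmul and np.dot compute the same product as @.
print(np.array_equal(matrix, np.matmul(A, B)))   # True
print(np.array_equal(matrix, np.dot(A, B)))      # True

# Matrix multiplication is not commutative in general.
print(np.array_equal(A @ B, B @ A))              # False
```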

In [23]:
a = 3 * np.ones((2,3))
a

array([[3., 3., 3.],
       [3., 3., 3.]])

In [24]:
rg = np.random.default_rng(1)     # create instance of default random number generator
b = rg.random((2,3))
b

array([[0.51182162, 0.9504637 , 0.14415961],
       [0.94864945, 0.31183145, 0.42332645]])

Some operations, such as += and *=, act in place to modify an existing array rather than create a new one.

In [25]:
a += b
a

array([[3.51182162, 3.9504637 , 3.14415961],
       [3.94864945, 3.31183145, 3.42332645]])
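
One consequence of in-place operation is that the array keeps its dtype. The following sketch, which uses a small integer array invented for this illustration, shows that *= modifies the existing object and that adding floats in place to an integer array is rejected rather than silently truncated.

```python
import numpy as np

a = np.ones((2, 3), dtype=int)
alias = a            # second name bound to the same array object

a *= 3               # in place: no new array is created
print(alias is a)    # True
print(alias[0, 0])   # 3, the change is visible through both names

# An in-place update cannot change the dtype, so adding floats to an
# integer array raises a TypeError instead of truncating the result.
try:
    a += np.full((2, 3), 0.5)
    raised = False
except TypeError:
    raised = True
print(raised)        # True
print(a.dtype.kind)  # 'i', still an integer array
```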

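Many unary operations, such as computing the sum of all the elements in the array, are implemented as methods of the ndarray class. Three examples are sum(), min(), and max():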

In [26]:
a.sum(), a.min(), a.max()

(21.29025228186613, 3.144159612719634, 3.950463696325935)

In [27]:
b = np.arange(12).reshape(3,4)
b

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

By default, these operations apply to the array as though it were a list of numbers, regardless of its shape. However, by specifying the axis parameter you can apply an operation along the specified axis of an array:

In [28]:
b.sum(axis=0)  # sum of each column

array([12, 15, 18, 21])

In [29]:
b.min(axis=1)  # min of each row

array([0, 4, 8])

In [30]:
b.cumsum(axis=1)  # cumulative sum along each row

array([[ 0,  1,  3,  6],
       [ 4,  9, 15, 22],
       [ 8, 17, 27, 38]])
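
A way to read the axis parameter is that it names the axis that gets collapsed. The sketch below illustrates this on the same 3 x 4 array.

```python
import numpy as np

b = np.arange(12).reshape(3, 4)

# Without axis, the reduction runs over every element.
print(b.sum())               # 66

# axis=0 collapses the rows, leaving one value per column.
print(b.sum(axis=0).shape)   # (4,)

# axis=1 collapses the columns, leaving one value per row.
print(b.sum(axis=1))         # [ 6 22 38]
```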

### Indexing

In [31]:
b

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

Arrays can be sliced along each axis. For example, the following selects rows 1 and 2 of column 1 of b:

In [32]:
b[1:3, 1]

array([5, 9])

A bare ':' selects every element along that dimension:

In [33]:
b[2, :]

array([ 8,  9, 10, 11])

Here b[2, :] is the last row of b, which can also be written with a negative index:

In [34]:
b[-1]

array([ 8,  9, 10, 11])
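
The indexing forms above can be compared directly. This is a small sketch on the same array; the (2, 1) case uses a slice on both axes, which is not shown in the cells above.

```python
import numpy as np

b = np.arange(12).reshape(3, 4)

# Omitted trailing indices are treated as complete slices.
print(np.array_equal(b[-1], b[-1, :]))   # True

# An integer index drops that axis; a slice keeps it.
print(b[1:3, 1].shape)     # (2,)
print(b[1:3, 1:2].shape)   # (2, 1)

# A bare ':' on the row axis selects a whole column.
print(b[:, 1])             # [1 5 9]
```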

To perform an operation on each element of the array, use the flat attribute, which is an iterator over all the elements of the array:

In [35]:
[x for x in b.flat]

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
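
The reason flat is useful is that plain iteration over a multidimensional array walks only the first axis. A short sketch of the difference:

```python
import numpy as np

b = np.arange(12).reshape(3, 4)

# Iterating over a 2-D array walks the first axis, so each item is a row.
rows = [row for row in b]
print(len(rows), rows[0])   # 3 [0 1 2 3]

# b.flat visits the individual elements in row-major order.
elements = [int(x) for x in b.flat]
print(elements)             # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
```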

### Changing the shape

In [36]:
a = np.floor(10*rg.random((3,4)))
a

array([[8., 4., 5., 0.],
       [7., 5., 3., 7.],
       [3., 4., 1., 4.]])

In [37]:
a.shape

(3, 4)

The shape of an array can be changed with various commands. Note that the following three commands all return a modified array, but do not change the original array:

ravel() returns a 1-D flattened array:

In [38]:
a.ravel()

array([8., 4., 5., 0., 7., 5., 3., 7., 3., 4., 1., 4.])

reshape() returns the reshaped array: 

In [39]:
a.reshape(6,2)  # return the reshaped array, doesn't change 'a'

array([[8., 4.],
       [5., 0.],
       [7., 5.],
       [3., 7.],
       [3., 4.],
       [1., 4.]])

T returns the transposed array:

In [40]:
a.T   # transposed

array([[8., 7., 3.],
       [4., 5., 4.],
       [5., 3., 1.],
       [0., 7., 4.]])

None of the three operations above changes a itself:

In [41]:
a

array([[8., 4., 5., 0.],
       [7., 5., 3., 7.],
       [3., 4., 1., 4.]])

resize(), by contrast, modifies 'a' in place:

In [42]:
a.resize((2,6))    # change 'a' itself
a

array([[8., 4., 5., 0., 7., 5.],
       [3., 7., 3., 4., 1., 4.]])
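
The contrast between reshape() and resize() can be checked by looking at the shape of the original array after each call. The sketch below builds the array from literal values (rather than from the random generator) so that it runs on its own.

```python
import numpy as np

a = np.array([[8., 4., 5., 0.],
              [7., 5., 3., 7.],
              [3., 4., 1., 4.]])

# reshape returns a new array object; a keeps its shape.
reshaped_shape = a.reshape(6, 2).shape
print(reshaped_shape, a.shape)   # (6, 2) (3, 4)

# resize changes the shape of a in place and returns None.
result = a.resize((2, 6))
print(result, a.shape)           # None (2, 6)
```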

If a dimension is given as -1 in a reshaping operation, its size is calculated automatically from the total number of elements and the other dimensions:

In [43]:
a.reshape(3,-1)  

array([[8., 4., 5., 0.],
       [7., 5., 3., 7.],
       [3., 4., 1., 4.]])
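
A few more uses of -1 are sketched below on a 12-element array chosen for illustration. The last case assumes the usual behaviour that reshape raises ValueError when the sizes do not divide evenly.

```python
import numpy as np

a = np.arange(12)

# The -1 entry is filled in so that the total stays at 12 elements.
print(a.reshape(3, -1).shape)      # (3, 4)
print(a.reshape(-1, 6).shape)      # (2, 6)
print(a.reshape(2, -1, 3).shape)   # (2, 2, 3)

# 12 elements cannot be split into 5 equal rows, so this fails.
try:
    a.reshape(5, -1)
    failed = False
except ValueError:
    failed = True
print(failed)                      # True
```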

### Stacking together different arrays

Several arrays can be stacked together along different axes:

In [44]:
a = np.floor(10*rg.random((2,2)))
a

array([[2., 2.],
       [7., 2.]])

In [45]:
b = np.floor(10*rg.random((2,2)))
b

array([[4., 9.],
       [9., 7.]])

Vertically stack a and b (along axis 0):

In [46]:
np.vstack((a,b))   # vertically stack 

array([[2., 2.],
       [7., 2.],
       [4., 9.],
       [9., 7.]])

Horizontally stack a and b (along axis 1): 

In [47]:
np.hstack((a,b))  # horizontally stack

array([[2., 2., 4., 9.],
       [7., 2., 9., 7.]])
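
For 2-D arrays, both stacking functions can also be expressed with np.concatenate and an explicit axis. np.concatenate is not used in the cells above; the sketch below is included only to show what "along axis 0" and "along axis 1" mean.

```python
import numpy as np

a = np.array([[2., 2.], [7., 2.]])
b = np.array([[4., 9.], [9., 7.]])

v = np.vstack((a, b))   # rows of b placed below rows of a
h = np.hstack((a, b))   # columns of b placed to the right of a
print(v.shape, h.shape)   # (4, 2) (2, 4)

# The same results via concatenate with an explicit axis.
print(np.array_equal(v, np.concatenate((a, b), axis=0)))   # True
print(np.array_equal(h, np.concatenate((a, b), axis=1)))   # True
```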

Note: This tutorial is borrowed from https://numpy.org/devdocs/user/quickstart.html, which you can consult for more information.