# Basics of NumPy



The following topics are covered in this tutorial:

- Construction
- Attributes
- Indexing
- Slicing
- Combined indexing & slicing
- Views & copies
- Reshaping
- Combining and splitting


# Setup NumPy


In [1]:
import numpy as np
print('numpy version: ', np.__version__)
np.random.seed(124)

numpy version:  1.21.6


# Construction


A NumPy array can be constructed directly from a Python list of values; nested lists produce multidimensional arrays.

In [2]:
a0 = np.array([0,1,2,3])
print(a0)

[0 1 2 3]


In [3]:
a00 = np.array([[0,1,2,3], [1,2,3,4], [2,3,1,4]])
print(a00)

[[0 1 2 3]
 [1 2 3 4]
 [2 3 1 4]]


In [4]:
a1 = np.random.randn(100000)
print(a1)

[ 0.28847906 -0.46295408 -1.33800442 ...  0.7165962   0.04406446
 -0.22293982]


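The function **np.random.uniform()** draws values uniformly from an interval, here from 0 to 20; the third argument is the shape of the array.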

In [5]:
a2 = np.random.uniform(0,20,(3,5))
print(a2)

[[12.21381579 17.64691537  1.92394764 12.32099227 14.19329493]
 [ 5.61746372  6.81516265 10.76025264 16.32530654  9.55802716]
 [18.82677797 19.38739233 19.70009127 10.35937077  2.93379732]]


In [6]:
a7 = np.random.uniform(0,80,(6,2,7,5,9,2,3))

There are also convenient ways to construct arrays filled with zeros, ones, or a given value: the **zeros()**, **ones()**, and **full()** functions. Pass the shape of the array as a tuple as the first argument.

In [7]:
z = np.zeros((2, 5))
print(z)
print()
o = np.ones((3, 4, 2))
print(o)
print()
n = np.full((3, 6), 8)
print(n)

[[0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]]

[[[1. 1.]
  [1. 1.]
  [1. 1.]
  [1. 1.]]

 [[1. 1.]
  [1. 1.]
  [1. 1.]
  [1. 1.]]

 [[1. 1.]
  [1. 1.]
  [1. 1.]
  [1. 1.]]]

[[8 8 8 8 8 8]
 [8 8 8 8 8 8]
 [8 8 8 8 8 8]]


The function **eye()** constructs an identity matrix of the given size.

In [8]:
e = np.eye(5)
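
Since `e` is not printed above, here is a small sketch of what **eye()** returns. The variable names are only for illustration, and the optional `k` argument (not used elsewhere in this notebook) shifts the diagonal:

```python
import numpy as np

# 3x3 identity matrix: ones on the main diagonal, zeros elsewhere
ident = np.eye(3)
print(ident)

# k=1 places the ones on the first diagonal above the main one
upper = np.eye(3, k=1)
print(upper)
```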

For all the above construction functions, you can define the data type of the array with the named **dtype** argument.

Common data types are **int64** (integer), **uint64** (unsigned integer), **float64**, and **complex64**, where the number is the size of one element in bits. Integer types also come in 8-, 16-, and 32-bit variants, floating-point types in 16- and 32-bit variants, and complex numbers in a 128-bit variant.

In [9]:
u1 = np.full((2,3),9, dtype = np.uint64)
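
To make the effect of **dtype** on storage visible, the following sketch builds arrays of the same shape with different data types and prints the number of bytes per element. The variable names are illustrative and not part of the notebook:

```python
import numpy as np

# Same shape, different data types
small = np.full((2, 3), 9, dtype=np.uint8)
wide = np.full((2, 3), 9, dtype=np.uint64)
cplx = np.zeros((2, 3), dtype=np.complex64)

# itemsize is the number of bytes used by a single element
print(small.dtype, small.itemsize)
print(wide.dtype, wide.itemsize)
print(cplx.dtype, cplx.itemsize)
```

A complex64 element takes 8 bytes because it stores two 32-bit floats, one for the real and one for the imaginary part.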

# Attributes



The attribute **ndim** gives the number of dimensions.

In [10]:
print(a7.ndim)

7


The attribute **shape** holds the size of each dimension as a tuple.

In [11]:
print(a7.shape)

(6, 2, 7, 5, 9, 2, 3)


In [12]:
type(a1.shape)

tuple

The Python type of a NumPy array is **ndarray**, regardless of its shape or contents.

In [13]:
type(a7)

numpy.ndarray

The attribute **dtype** holds the data type of the elements.


In [14]:
print(a1.dtype)

float64


The attribute **size** holds the total number of elements in the array. 

In [15]:
print(a7.size)

22680


In [16]:
print(a7.nbytes)

181440
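
The attributes are related: **nbytes** equals **size** times the bytes per element. A minimal sketch on a smaller array (the name `b` and the **itemsize** attribute are additions for this illustration):

```python
import numpy as np

b = np.zeros((6, 2, 7), dtype=np.float64)

print(b.ndim)      # number of dimensions
print(b.shape)     # size of each dimension
print(b.size)      # total number of elements: 6 * 2 * 7
print(b.itemsize)  # bytes per element for float64
print(b.nbytes == b.size * b.itemsize)
```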


# Indexing



In [17]:
print(a1[6])

-1.0175213629626445


In [18]:
for i in range(10):
    print(f'a1[{i}] : {a1[i]}')

a1[0] : 0.2884790609311843
a1[1] : -0.46295408222658696
a1[2] : -1.338004420633611
a1[3] : 2.317015672344602
a1[4] : -1.4673759339812118
a1[5] : -0.748547692135884
a1[6] : -1.0175213629626445
a1[7] : 1.6350668010249894
a1[8] : 0.9225456123685942
a1[9] : -0.719881324562294


In [19]:
print(a0)
print(' ')
print(a0[-4])

[0 1 2 3]
 
0


In [20]:
print(a2)
print(' ')
print(a2[0:2,2])

[[12.21381579 17.64691537  1.92394764 12.32099227 14.19329493]
 [ 5.61746372  6.81516265 10.76025264 16.32530654  9.55802716]
 [18.82677797 19.38739233 19.70009127 10.35937077  2.93379732]]
 
[ 1.92394764 10.76025264]


In [21]:
a2[1,2] = 2.34
print(a2)

[[12.21381579 17.64691537  1.92394764 12.32099227 14.19329493]
 [ 5.61746372  6.81516265  2.34       16.32530654  9.55802716]
 [18.82677797 19.38739233 19.70009127 10.35937077  2.93379732]]


In [22]:
print(a0)
a0[2] = 7.56
# the decimals are not stored: a0 has an integer dtype, so the
# floating-point value is first converted to an integer, losing
# the decimals in the process
print(a0)

[0 1 2 3]
[0 1 7 3]
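
If the decimals should be kept, the array needs a floating-point dtype. The sketch below shows one possible way, converting with **astype()**; the variable names are illustrative:

```python
import numpy as np

ints = np.array([0, 1, 2, 3])
ints[2] = 7.56             # silently truncated to 7
print(ints)

# astype() returns a new array with the requested dtype
floats = ints.astype(np.float64)
floats[2] = 7.56           # stored with its decimals
print(floats)
```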


# Slicing

In [23]:
r10 = np.random.randint(10, size=(10))
print(r10)
print(r10[0:5:1])
print(r10[2:8:1])
print(r10[0:10:2])

[2 5 5 5 1 1 8 7 4 6]
[2 5 5 5 1]
[5 5 1 1 8 7]
[2 5 1 8 4]


In [24]:
print(r10[:5:])
print(r10[2:8:])
print(r10[::2])

[2 5 5 5 1]
[5 5 1 1 8 7]
[2 5 1 8 4]


In [25]:
print(r10[::-1])
print(r10[8::-1])

[6 4 7 8 1 1 5 5 5 2]
[4 7 8 1 1 5 5 5 2]
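
The cells above use the slice syntax `start:stop:step`, where omitted parts fall back to defaults. As a sketch on a deterministic array (so the results do not depend on the random seed; the names are illustrative):

```python
import numpy as np

v = np.arange(10)          # [0 1 2 3 4 5 6 7 8 9]

first_five = v[:5]         # start defaults to 0, step to 1
every_second = v[::2]      # stop defaults to the end of the array
reversed_all = v[::-1]     # negative step walks backwards
reversed_from_8 = v[8::-1] # start at index 8, walk back to index 0

print(first_five)
print(every_second)
print(reversed_all)
print(reversed_from_8)
```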


In [26]:
i12 = np.array([[8, 2, 5, 3, 6], [9, 4, 0, 2, 8], [7, 6, 1, 9, 3]])
print(i12)
print()
print(i12[0:2:1, 2:5:1])

[[8 2 5 3 6]
 [9 4 0 2 8]
 [7 6 1 9 3]]

[[5 3 6]
 [0 2 8]]


In [27]:
# same as above
print(i12[:2:, 2::])

[[5 3 6]
 [0 2 8]]


In [28]:
print(i12[1::, ::-2])
print()
print(i12[1:, ::-2])

[[8 0 9]
 [3 1 7]]

[[8 0 9]
 [3 1 7]]


In [29]:
print(i12[:2, 2:])

[[5 3 6]
 [0 2 8]]


In [30]:
print(i12[0:2, :])

[[8 2 5 3 6]
 [9 4 0 2 8]]


In [31]:
print(i12[1::, ...])

[[9 4 0 2 8]
 [7 6 1 9 3]]


In [32]:
# Remember the shape of a7 is (6, 2, 7, 5, 9, 2, 3)
print(a7[0:3:, ...].shape)
print(a7[0:3:, 0:1:, ..., 0:3:, 0:1:2].shape)

(3, 2, 7, 5, 9, 2, 3)
(3, 1, 7, 5, 9, 2, 1)


In [33]:
print(np.squeeze(a7[0:3:, 0:1:, ..., 0:3:, 0:1:2]).shape)

(3, 7, 5, 9, 2)


In [34]:
print(np.squeeze(a7[0:3:, 0:1:, ..., 0:3:, 0:1:2]).size)
print(a7[0:3:, 0:1:, ..., 0:3:, 0:1:2].size)

1890
1890
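
The last cells combine two ideas: the ellipsis `...` stands for "all remaining axes", and **squeeze()** removes axes of length 1 without changing the number of elements. A smaller sketch with an illustrative 4-dimensional array:

```python
import numpy as np

t = np.zeros((6, 2, 7, 5))

# Slice the first two axes and the last one; ... covers the axis in between
sub = t[0:3, 0:1, ..., 0:2]
print(sub.shape)

# squeeze() drops the axis of length 1 but keeps all elements
squeezed = np.squeeze(sub)
print(squeezed.shape)
print(sub.size == squeezed.size)
```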


# Combined Indexing & Slicing

In [35]:
print(i12)
print()
print(i12[1, 1:4])

[[8 2 5 3 6]
 [9 4 0 2 8]
 [7 6 1 9 3]]

[4 0 2]


In [36]:
print(i12.shape)
print(i12[1, 1:4].shape)

(3, 5)
(3,)


In [37]:
i3D = np.random.randint(0, 10,(2, 4, 5))
print(i3D.shape)
print()
print(i3D)

(2, 4, 5)

[[[1 7 3 5 6]
  [9 2 6 9 9]
  [8 3 2 5 9]
  [8 3 0 3 6]]

 [[5 9 4 7 7]
  [1 7 9 5 9]
  [2 1 8 4 4]
  [9 1 1 0 9]]]


In [38]:
print(i3D[:, 1:3, :].shape)
print()
print(i3D[:, 1:3, :])

(2, 2, 5)

[[[9 2 6 9 9]
  [8 3 2 5 9]]

 [[1 7 9 5 9]
  [2 1 8 4 4]]]


In [39]:
print(i3D[::, 1:2, ::].shape)
print()
print(i3D[::, 1:2, ::])

(2, 1, 5)

[[[9 2 6 9 9]]

 [[1 7 9 5 9]]]


In [40]:
print(i3D)
print(i3D[::, 1, ::].shape)
print()
print(i3D[::, 1, ::])

[[[1 7 3 5 6]
  [9 2 6 9 9]
  [8 3 2 5 9]
  [8 3 0 3 6]]

 [[5 9 4 7 7]
  [1 7 9 5 9]
  [2 1 8 4 4]
  [9 1 1 0 9]]]
(2, 5)

[[9 2 6 9 9]
 [1 7 9 5 9]]


In [41]:
# all of row 1
print(i12[1, :])
# all of column 4
print(i12[:, 4])

[9 4 0 2 8]
[6 8 3]


In [42]:
# all of column 4, kept two-dimensional by using a slice
print(i12[:, 4:5])

[[6]
 [8]
 [3]]
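
The rule behind the cells above is that an integer index removes its axis, while a slice keeps it, even when the slice selects a single row or column. A sketch with an illustrative array `m`:

```python
import numpy as np

m = np.arange(15).reshape(3, 5)

row_part = m[1, 1:4]      # integer on axis 0: that axis disappears
row_kept = m[1:2, 1:4]    # slice on axis 0: axis of length 1 remains
col_flat = m[:, 4]        # integer on axis 1: result is 1-dimensional
col_kept = m[:, 4:5]      # slice on axis 1: result is a 3x1 column

print(row_part.shape, row_kept.shape)
print(col_flat.shape, col_kept.shape)
```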


# Views & Copies

In [43]:
print(i12)
print()
i12_slice = i12[1:3, 1:4]
print(i12_slice)

[[8 2 5 3 6]
 [9 4 0 2 8]
 [7 6 1 9 3]]

[[4 0 2]
 [6 1 9]]


In [44]:
i12_slice[0,1] = 8
print(i12_slice)
print(' ')
print(i12)

[[4 8 2]
 [6 1 9]]
 
[[8 2 5 3 6]
 [9 4 8 2 8]
 [7 6 1 9 3]]


In [45]:
i12[1:3, 1:4] = 3
print(i12)

[[8 2 5 3 6]
 [9 3 3 3 8]
 [7 3 3 3 3]]


In [46]:
i12_slice_copy = i12[0:2, 0:3].copy()
print(i12_slice_copy)

[[8 2 5]
 [9 3 3]]


In [47]:
i12_slice_copy[:, :] = 5
print(i12_slice_copy)
print()
print(i12)

[[5 5 5]
 [5 5 5]]

[[8 2 5 3 6]
 [9 3 3 3 8]
 [7 3 3 3 3]]
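
Whether two arrays refer to the same data can also be checked directly. The following sketch uses **np.shares_memory()** on a view and on a copy; the array names are illustrative:

```python
import numpy as np

base = np.arange(12).reshape(3, 4)
view = base[1:3, 1:3]          # slicing returns a view
copy = base[1:3, 1:3].copy()   # copy() allocates new memory

print(np.shares_memory(base, view))
print(np.shares_memory(base, copy))

view[0, 0] = -1   # writes through to base
copy[0, 0] = -2   # leaves base untouched
print(base[1, 1])
```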


# Reshaping

In [48]:
f16 = np.arange(16.0, dtype=np.float64)
print(f16)
# f16 is a NumPy array
print(type(f16))
# that stores values of data type (64-bit) floating point
print(f16.dtype)

[ 0.  1.  2.  3.  4.  5.  6.  7.  8.  9. 10. 11. 12. 13. 14. 15.]
<class 'numpy.ndarray'>
float64


In [49]:
f16_reshaped = f16.reshape((4, 4))
print(f16_reshaped)

[[ 0.  1.  2.  3.]
 [ 4.  5.  6.  7.]
 [ 8.  9. 10. 11.]
 [12. 13. 14. 15.]]


In [50]:
f4 = np.arange(4, dtype=np.float64)
# row matrix
print(f4[np.newaxis, :].shape)
print()
# column matrix
print(f4[:, np.newaxis].shape)

(1, 4)

(4, 1)
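
Two details of **reshape()** are worth a short sketch: one dimension may be given as -1 and is then inferred from the total size, and the new shape must account for exactly the same number of elements. The variable names are illustrative:

```python
import numpy as np

f = np.arange(16.0)

m = f.reshape(4, -1)        # -1 is inferred as 4
cube = f.reshape(2, 2, -1)  # -1 is inferred as 4
print(m.shape, cube.shape)

# For a contiguous array like f, the reshaped array is a view on the same data
m[0, 0] = 99.0
print(f[0])

# A shape with a different number of elements is rejected
try:
    f.reshape(3, 5)
    mismatch_rejected = False
except ValueError:
    mismatch_rejected = True
print(mismatch_rejected)
```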


# Combining and Splitting


In [51]:
x = np.random.randint(0, 9, (3))
y = np.random.randint(0, 9, (3))
z = np.random.randint(0, 9, (3))
r = np.concatenate([x, y, z])
print(x)
print(y)
print(z)
print()
print(r)

[2 2 1]
[7 5 1]
[7 8 7]

[2 2 1 7 5 1 7 8 7]


For multidimensional arrays, the dimension to concatenate along can be specified with the **axis** argument. (Be aware that the numbering of axes starts with 0 and that the axis must exist. You cannot increase the number of dimensions of the arrays with this function, only the size of an existing dimension.)

In [52]:
x2 = np.random.randint(0, 9, (2, 4))
y2 = np.random.randint(0, 9, (2, 4))
r1 = np.concatenate([x2, y2], axis=0)
r2 = np.concatenate([x2, y2], axis=1)
print(x2)
print()
print(y2)
print()
print(r1)
print()
print(r2)

[[4 6 8 0]
 [6 1 1 2]]

[[7 3 8 4]
 [6 8 4 7]]

[[4 6 8 0]
 [6 1 1 2]
 [7 3 8 4]
 [6 8 4 7]]

[[4 6 8 0 7 3 8 4]
 [6 1 1 2 6 8 4 7]]


You can also number the axes from the back using negative axis indices. The following is equivalent to the above.

In [53]:
r1 = np.concatenate([x2, y2], axis=-2)
r2 = np.concatenate([x2, y2], axis=-1)
print(x2)
print()
print(y2)
print()
print(r1)
print()
print(r2)

[[4 6 8 0]
 [6 1 1 2]]

[[7 3 8 4]
 [6 8 4 7]]

[[4 6 8 0]
 [6 1 1 2]
 [7 3 8 4]
 [6 8 4 7]]

[[4 6 8 0 7 3 8 4]
 [6 1 1 2 6 8 4 7]]


The axis index -1 is particularly useful as one often wants to concatenate by the last axis. And by passing the axis argument -1 (instead of a positive number), it is not necessary to keep track of how many dimensions the array has: it is the last dimension, no matter how many dimensions the input array has. For example, if it happens that the array shape needs to be changed (by increasing or decreasing the number of dimensions), because you added some further code at the beginning of your program, then the index -1 still refers to the last dimension. No need to correct it.
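
As a sketch of this point, the hypothetical helper `join_last()` below (not part of NumPy or of this notebook) works unchanged for 2-dimensional and 3-dimensional inputs because it always uses axis -1:

```python
import numpy as np

def join_last(a, b):
    """Concatenate two arrays along their last axis."""
    return np.concatenate([a, b], axis=-1)

# 2-dimensional inputs: the last axis is axis 1
r2d = join_last(np.zeros((2, 3)), np.ones((2, 5)))
print(r2d.shape)

# 3-dimensional inputs: the last axis is axis 2, same code
r3d = join_last(np.zeros((4, 2, 3)), np.ones((4, 2, 5)))
print(r3d.shape)
```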

The function **vstack()** stacks arrays vertically, meaning it concatenates arrays row wise.

In [54]:
r = np.vstack([x, y, z])
print(r)

[[2 2 1]
 [7 5 1]
 [7 8 7]]


Note the difference in the shape of the resulting arrays between concatenate and vstack. For 1-dimensional inputs, concatenate joins the arrays along the existing axis, while vstack introduces a new axis.

In [55]:
print(np.concatenate([x, y, z]).shape)
print(np.vstack([x, y, z]).shape)

(9,)
(3, 3)
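
One way to see where the new axis comes from: for 1-dimensional inputs, vstack gives the same result as first turning each input into a row with `np.newaxis` and then concatenating along axis 0. A sketch with fixed example values (chosen for illustration):

```python
import numpy as np

x = np.array([2, 2, 1])
y = np.array([7, 5, 1])

stacked = np.vstack([x, y])

# Make each 1-D array a (1, 3) row first, then join the rows
rows = np.concatenate([x[np.newaxis, :], y[np.newaxis, :]], axis=0)

print(stacked.shape)
print(np.array_equal(stacked, rows))
```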


A new axis is, however, only introduced if the respective axis does not already exist. If we stack along the horizontal direction with the function **hstack()**, that is, column wise, then no new axis needs to be introduced, even for the 1-dimensional case.

In [56]:
r = np.hstack([x, y, z])
print(r.shape)
print(r)

(9,)
[2 2 1 7 5 1 7 8 7]


For 2-dimensional arrays, neither vstack nor hstack introduces a new dimension, since this is not necessary. The result is therefore the same as with concatenate along axis 0 and axis 1, respectively.

In [57]:
print(np.vstack([x2, y2]))
print()
print(np.hstack([x2, y2]))

[[4 6 8 0]
 [6 1 1 2]
 [7 3 8 4]
 [6 8 4 7]]

[[4 6 8 0 7 3 8 4]
 [6 1 1 2 6 8 4 7]]


But we can stack 2-dimensional arrays according to the depth dimension with the function **dstack()**.

In [58]:
r3 = np.dstack([x2, y2])
print(x2)
print(r3.shape)
print()
print(r3)

[[4 6 8 0]
 [6 1 1 2]]
(2, 4, 2)

[[[4 7]
  [6 3]
  [8 8]
  [0 4]]

 [[6 6]
  [1 8]
  [1 4]
  [2 7]]]


Due to the fact that the resulting array now has 3 dimensions, the output with print is no longer as nice and easy to follow.

There is also a general function **stack()**, which joins arrays of identical shape along a new axis whose position you choose with the **axis** argument.
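
A short sketch of **stack()** on two illustrative 2-dimensional arrays shows how the axis argument decides where the new dimension is placed, and that stacking along the last axis matches dstack for this case:

```python
import numpy as np

a = np.zeros((2, 4))
b = np.ones((2, 4))

front = np.stack([a, b], axis=0)   # new axis comes first
back = np.stack([a, b], axis=-1)   # new axis comes last

print(front.shape)
print(back.shape)
print(np.array_equal(back, np.dstack([a, b])))
```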

For the splitting of arrays, NumPy provides the functions **split()**, **vsplit()**, **hsplit()**, and **dsplit()**, which work on a given axis, row wise, column wise, or depth wise, respectively. The array is passed as the first argument and the number of sections as the second. When no further information is passed to the functions, the split produces sub-arrays of equal size, so the length of the axis must be divisible by the number of sections.

Note that the functions return a list with as many sub-arrays as requested by the respective argument. The results of the split functions are views on the original array rather than copies (as with the slicing operation).

In [59]:
print(x2)
print()
print(np.vsplit(x2, 2))
s1, s2 = np.vsplit(x2, 2)
print()
print(s1)
print()
print(s2)

[[4 6 8 0]
 [6 1 1 2]]

[array([[4, 6, 8, 0]]), array([[6, 1, 1, 2]])]

[[4 6 8 0]]

[[6 1 1 2]]


In [60]:
print(y2)
print()
s1, s2 = np.hsplit(y2, 2)
print(s1)
print()
print(s2)

[[7 3 8 4]
 [6 8 4 7]]

[[7 3]
 [6 8]]

[[8 4]
 [4 7]]
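
That the sub-arrays are views can be checked with a small sketch: writing into one part changes the original array. The array `m` and the other names are illustrative:

```python
import numpy as np

m = np.arange(8).reshape(2, 4)
left, right = np.hsplit(m, 2)

# Both halves share memory with m
print(np.shares_memory(m, left), np.shares_memory(m, right))

# Writing into the left half is visible in m
left[0, 0] = 100
print(m)
```

If independent sub-arrays are needed, calling `.copy()` on each part is one option.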


Instead of the number of sections, you can also pass a list of indices as the second argument, which defines exactly where the splits happen along the respective axis. (The returned sub-arrays are then not necessarily of equal size, but exactly as you defined them.) Note that an index of 0, as used in the following cell, produces an empty first sub-array.

In [61]:
print(x2)
print()
s1, s2, s3 = np.hsplit(x2, [0,2])
print()
print(s1)
print()
print(s2)
print()
print(s3)

[[4 6 8 0]
 [6 1 1 2]]


[]

[[4 6]
 [6 1]]

[[8 0]
 [1 2]]
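
The following sketch, on an illustrative 2x6 array, shows a split at indices without an empty part, and also **np.array_split()**, which accepts a number of sections that does not divide the axis evenly (in contrast to **split()**, which raises a ValueError in that case):

```python
import numpy as np

m = np.arange(12).reshape(2, 6)

# Split before column 2 and before column 5
a, b, c = np.hsplit(m, [2, 5])
print(a.shape, b.shape, c.shape)

# 7 elements cannot be split into 3 equal parts
seven = np.arange(7)
try:
    np.split(seven, 3)
    equal_split_failed = False
except ValueError:
    equal_split_failed = True
print(equal_split_failed)

# array_split() allows parts of unequal size instead
parts = np.array_split(seven, 3)
print([p.size for p in parts])
```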


# Closing Remarks

This notebook shows only the most common ways of constructing, indexing, slicing, reshaping, combining, and splitting multidimensional arrays; NumPy offers many more functions for these purposes. Check the official NumPy documentation for further options. There are also many tricks of the trade for accomplishing such tasks elegantly, so it is always worth searching the web to see whether someone has encountered the same problem and found a good way to solve it.

Once you become more experienced with NumPy, you will probably love to use it.