# NumPy Basics - Part 1

NumPy forms the bedrock of the Python scientific stack. It provides a key data structure, the n-dimensional array (*ndarray*), along with fast mathematical operations on it.

In [1]:
import numpy as np

# create a numpy array of integers from 1 to 50
a = np.arange(1, 51)
print(a)

[ 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50]


NumPy arrays have attributes *ndim*, *shape* and *size*.

In [3]:
print("ndim = {}, shape = {} and size = {}.".format(a.ndim, a.shape, a.size))

ndim = 1, shape = (50,) and size = 50.


The data type (*dtype*), the size of each element in bytes (*itemsize*), and the total size of the array in bytes (*nbytes*) are other useful attributes exposed by NumPy arrays.

In [4]:
print("data type = {}, element size = {}, total size = {}".format(a.dtype, a.itemsize, a.nbytes))

data type = int64, element size = 8, total size = 400
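
The total size is the element count multiplied by the element size. A minimal sketch of that relationship, assuming an explicit *dtype* is requested so the numbers do not depend on the platform's default integer type (the name `x` is illustrative and not part of the notebook above):

```python
import numpy as np

# Ask for 32-bit integers explicitly; the default integer type varies by platform.
x = np.arange(1, 51, dtype=np.int32)

# itemsize is 4 bytes for int32, so 50 elements occupy 200 bytes.
print(x.itemsize, x.size, x.nbytes)

# nbytes always equals size * itemsize.
print(x.nbytes == x.size * x.itemsize)
```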


Indexing works just as it does for standard Python sequences: indices start at 0, and negative indices count from the end.

## Accessing Elements of One-Dimensional Arrays Through Indexing

In [5]:
# access the first and last element of the array
print(a[0], ",", a[-1])

1 , 50


In [6]:
# access the second-to-last and third-to-last elements of the array
print(a[-2], ',', a[-3])

49 , 48


## Accessing Multi-Dimensional Array Elements Through Indices

Individual elements of multi-dimensional arrays can likewise be accessed with a tuple of comma-separated indices, one index per dimension.

In [7]:
np.random.seed(5)
b = np.random.randint(50, size=(10, 5))
print(b)

[[35 14 47 38 16]
 [ 9  8 36 39 27]
 [48 30 16  7 12]
 [15 49 39 16 27]
 [44 13 11  1 47]
 [30 20 22 18  9]
 [42 41 41  1 18]
 [39 16 14  5  0]
 [16  4 46 36 41]
 [27 31  2  4 38]]


In [8]:
# access the element in the first row, third column
b[0, 2]

47

In [9]:
# access the element in the last row and last column
b[-1, -1]

38

In [10]:
# access the element in the first row, first column
b[0, 0]

35

In [11]:
# update the element in the last row, last column
b[-1, -1] = 100

In [12]:
b

array([[ 35,  14,  47,  38,  16],
       [  9,   8,  36,  39,  27],
       [ 48,  30,  16,   7,  12],
       [ 15,  49,  39,  16,  27],
       [ 44,  13,  11,   1,  47],
       [ 30,  20,  22,  18,   9],
       [ 42,  41,  41,   1,  18],
       [ 39,  16,  14,   5,   0],
       [ 16,   4,  46,  36,  41],
       [ 27,  31,   2,   4, 100]])
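
Because the comma-separated indices inside the brackets form a tuple, the spellings below address the same element. This is a small sketch on a hypothetical array `m`, which is not part of the notebook above:

```python
import numpy as np

m = np.arange(12).reshape(3, 4)  # hypothetical 3x4 array holding 0..11

idx = (1, 2)
# m[1, 2] and m[idx] are the same indexing operation.
print(m[1, 2], m[idx])

# Chained indexing returns the same value, but it first builds the row m[1].
print(m[1][2])
```

The single-bracket form is usually preferred, since it avoids creating the intermediate row.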

## Slicing Subarrays off One-Dimensional Arrays

Just as in standard Python, the colon ":" is the slicing operator for NumPy arrays.

In [13]:
# print the first 25 elements of 'a'
a[:25]

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17,
       18, 19, 20, 21, 22, 23, 24, 25])

In [15]:
# print the last 25 elements of the array
a[25:]

array([26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42,
       43, 44, 45, 46, 47, 48, 49, 50])

In [16]:
# select the odd numbers starting from 3, which is at index 2
a[2::2]

array([ 3,  5,  7,  9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35,
       37, 39, 41, 43, 45, 47, 49])

In [17]:
# print the reverse of the array 'a'
a[::-1]

array([50, 49, 48, 47, 46, 45, 44, 43, 42, 41, 40, 39, 38, 37, 36, 35, 34,
       33, 32, 31, 30, 29, 28, 27, 26, 25, 24, 23, 22, 21, 20, 19, 18, 17,
       16, 15, 14, 13, 12, 11, 10,  9,  8,  7,  6,  5,  4,  3,  2,  1])

In [18]:
# print every other element of 'a' in reverse order, starting at the second-to-last element, i.e. 49
a[-2::-2]

array([49, 47, 45, 43, 41, 39, 37, 35, 33, 31, 29, 27, 25, 23, 21, 19, 17,
       15, 13, 11,  9,  7,  5,  3,  1])
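
Each slice above is an instance of the general form `start:stop:step`, where any omitted part takes its default. A brief sketch using Python's built-in `slice` object on a hypothetical array `v` (not part of the notebook above):

```python
import numpy as np

v = np.arange(1, 11)  # hypothetical array holding 1..10

# start=1, stop=8 (exclusive), step=3 selects indices 1, 4 and 7.
print(v[1:8:3])

# The same selection expressed as an explicit slice object.
s = slice(1, 8, 3)
print(v[s])

# Slice bounds beyond the end are clipped instead of raising an error.
print(v[5:100])
```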

## Slicing Subarrays off Multi-Dimensional Arrays

Slicing subarrays from multi-dimensional arrays works the same way. The slices for different dimensions are separated by commas within square brackets.

In [19]:
# print b
print(b)

[[ 35  14  47  38  16]
 [  9   8  36  39  27]
 [ 48  30  16   7  12]
 [ 15  49  39  16  27]
 [ 44  13  11   1  47]
 [ 30  20  22  18   9]
 [ 42  41  41   1  18]
 [ 39  16  14   5   0]
 [ 16   4  46  36  41]
 [ 27  31   2   4 100]]


In [20]:
# form a subarray comprising the first three rows and columns
b[:3, :3]


array([[35, 14, 47],
       [ 9,  8, 36],
       [48, 30, 16]])

In [21]:
# form a subarray comprising the last 2 rows and columns
b[-2:, -2:]

array([[ 36,  41],
       [  4, 100]])

In [22]:
# subarray comprising every other row, all columns
b[::2, :]


array([[35, 14, 47, 38, 16],
       [48, 30, 16,  7, 12],
       [44, 13, 11,  1, 47],
       [42, 41, 41,  1, 18],
       [16,  4, 46, 36, 41]])

In [23]:
# reverse the order of both the rows and the columns
b[::-1, ::-1]

array([[100,   4,   2,  31,  27],
       [ 41,  36,  46,   4,  16],
       [  0,   5,  14,  16,  39],
       [ 18,   1,  41,  41,  42],
       [  9,  18,  22,  20,  30],
       [ 47,   1,  11,  13,  44],
       [ 27,  16,  39,  49,  15],
       [ 12,   7,  16,  30,  48],
       [ 27,  39,  36,   8,   9],
       [ 16,  38,  47,  14,  35]])

In [24]:
# Accessing first row of the array
b[0, :]

array([35, 14, 47, 38, 16])

In [25]:
# accessing the last column of the array
b[:, -1]

array([ 16,  27,  12,  27,  47,   9,  18,   0,  41, 100])

When selecting whole rows, the column slice is redundant. Here is a more concise way to access the first row.

In [26]:
b[0]

array([35, 14, 47, 38, 16])

First 2 rows only.

In [27]:
b[:2]

array([[35, 14, 47, 38, 16],
       [ 9,  8, 36, 39, 27]])

Last 2 rows only.

In [28]:
# last 2 rows
b[-2:]

array([[ 16,   4,  46,  36,  41],
       [ 27,  31,   2,   4, 100]])
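
One detail worth keeping in mind when mixing integers and slices: an integer index removes that dimension from the result, while a slice keeps it, even when the slice selects a single row or column. A short sketch on a hypothetical array `m` (not part of the notebook above):

```python
import numpy as np

m = np.arange(12).reshape(3, 4)  # hypothetical 3x4 array

print(m[0].shape)       # integer index: 1-D result, shape (4,)
print(m[0:1].shape)     # length-1 slice: 2-D result, shape (1, 4)
print(m[:, -1].shape)   # last column as a 1-D array, shape (3,)
print(m[:, -1:].shape)  # last column kept as a 2-D column, shape (3, 1)
```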

## Subarrays are Views!

Slicing a Python list creates a copy, but slicing a NumPy array returns a view onto the same underlying data. This can be observed by changing a value in a subarray and checking that the change appears in the parent array.

In [29]:
print(b)

[[ 35  14  47  38  16]
 [  9   8  36  39  27]
 [ 48  30  16   7  12]
 [ 15  49  39  16  27]
 [ 44  13  11   1  47]
 [ 30  20  22  18   9]
 [ 42  41  41   1  18]
 [ 39  16  14   5   0]
 [ 16   4  46  36  41]
 [ 27  31   2   4 100]]


In [30]:
c = b[:2,:2]
print(c)

[[35 14]
 [ 9  8]]


Change a value in the subarray 'c' and observe the same change in array 'b'.

In [31]:
c[0, 1] = 1000
print(c)

[[  35 1000]
 [   9    8]]


Let's see if the element b[0, 1] has been changed to 1000.

In [32]:
b

array([[  35, 1000,   47,   38,   16],
       [   9,    8,   36,   39,   27],
       [  48,   30,   16,    7,   12],
       [  15,   49,   39,   16,   27],
       [  44,   13,   11,    1,   47],
       [  30,   20,   22,   18,    9],
       [  42,   41,   41,    1,   18],
       [  39,   16,   14,    5,    0],
       [  16,    4,   46,   36,   41],
       [  27,   31,    2,    4,  100]])
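
For contrast, the same experiment on a plain Python list leaves the original untouched, because list slicing copies the selected elements. A minimal sketch with a hypothetical list `nums`:

```python
nums = [35, 14, 47, 38, 16]  # hypothetical plain Python list

part = nums[:2]   # list slicing creates a new list
part[1] = 1000    # modify the slice only

print(part)  # [35, 1000]
print(nums)  # [35, 14, 47, 38, 16]
```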

The fact that subarrays are views becomes important when processing huge datasets: a slice can be taken and modified without duplicating the underlying data in memory.

If a copy of an array needs to be created, the *copy()* array method can be used.

In [33]:
d = b[:2, :2].copy()
print(d)

[[  35 1000]
 [   9    8]]



Change 1000 to 500 in array 'd'.

In [34]:
d[0, 1] = 500
print(d)

[[ 35 500]
 [  9   8]]


Check array 'b' to confirm that element [0, 1] is still 1000.

In [35]:
print(b)

[[  35 1000   47   38   16]
 [   9    8   36   39   27]
 [  48   30   16    7   12]
 [  15   49   39   16   27]
 [  44   13   11    1   47]
 [  30   20   22   18    9]
 [  42   41   41    1   18]
 [  39   16   14    5    0]
 [  16    4   46   36   41]
 [  27   31    2    4  100]]


As expected, the parent array 'b' did not change.
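
Instead of modifying values and inspecting the parent, the view-versus-copy distinction can also be checked directly. The sketch below uses `np.shares_memory` on a hypothetical parent array `p`; the names `p`, `view` and `copy` are illustrative and not part of the notebook above:

```python
import numpy as np

p = np.arange(20).reshape(4, 5)  # hypothetical parent array

view = p[:2, :2]          # slicing returns a view
copy = p[:2, :2].copy()   # copy() allocates new memory

print(np.shares_memory(p, view))  # True: the view reuses p's data
print(np.shares_memory(p, copy))  # False: the copy is independent

# Writing through the view is visible in the parent.
view[0, 0] = -1
print(p[0, 0])  # -1
```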