## NumPy: An Introduction

The topics and code covered in this notebook are sourced from W3Schools (with small tweaks here and there) and from my own research.

The comments and explanations are either my own or from the website. Refer to the website for more examples and details. Read more [here](https://www.w3schools.com/python/numpy/default.asp).

### Basics

#### What is NumPy?

NumPy, short for Numerical Python, is a Python library. It is written partly in Python, but most of the parts that require fast computation are written in C or C++.

It is used for working with arrays. It also has functions for working in the domains of linear algebra, Fourier transforms, and matrices.

In Python, lists serve the purpose of arrays, but they are slow to process. NumPy aims to provide an array object that is up to 50x faster than traditional Python lists (the figure quoted by W3Schools; the actual speedup depends on the operation and the array size). The array object in NumPy is called "ndarray", and it comes with many supporting functions that make working with it easy.

NumPy arrays are stored in one contiguous block of memory, unlike lists, so processes can access and manipulate them very efficiently. This behavior is called "locality of reference" in computer science, and it is the main reason why NumPy is faster than lists. NumPy is also optimized to work with the latest CPU architectures.
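
As a rough illustration of the memory layout described above, the sketch below inspects an array's flags and strides. The variable name is illustrative, and the exact item size depends on the platform's default integer type.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

# True when the elements sit in one unbroken block of memory
print(arr.flags["C_CONTIGUOUS"])

# bytes per element, and bytes to step over to reach the next element
print(arr.itemsize, arr.strides)

# the data buffer holds exactly size * itemsize bytes, with no gaps
print(arr.nbytes == arr.size * arr.itemsize)
```

For a 1-D contiguous array the single stride equals the item size, which is what lets NumPy walk the data without following pointers.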

NumPy was created in 2005 by Travis Oliphant. It is an open source project. Read more [here](https://github.com/numpy/numpy).

#### Importing numpy

In [1]:
import numpy as np

#### Checking version

In [2]:
print(np.__version__)

1.21.5


#### Creating arrays

In [3]:
arr = np.array([1, 2, 3, 4, 5])

print(arr)

## prints type of array
print(type(arr))

[1 2 3 4 5]
<class 'numpy.ndarray'>


Note: The array object in NumPy is called "ndarray".

In [4]:
## using tuple to create an array
arr = np.array((1, 2, 3, 4, 5))

print(arr)

[1 2 3 4 5]


Note: To create an ndarray, we can pass a list, tuple, or any array-like object into the array() function, and it will be converted into an ndarray.
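
A small sketch of the point above, assuming a list, a tuple, and a `range` as the three array-like inputs (the variable names are my own):

```python
import numpy as np

from_list = np.array([1, 2, 3])
from_tuple = np.array((1, 2, 3))
from_range = np.array(range(1, 4))   # a range is also accepted as a sequence

print(from_list, from_tuple, from_range)

# all three inputs produce the same ndarray contents
print(np.array_equal(from_list, from_range))
```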

#### Dimensions of an array

##### 0-D array

In [5]:
arr = np.array(42)

print(arr)

42


##### 1-D array

In [6]:
arr = np.array([1, 2, 3, 4, 5])

print(arr)

[1 2 3 4 5]


##### 2-D array

In [7]:
arr = np.array([[1, 2, 3], [4, 5, 6]])

print(arr)

[[1 2 3]
 [4 5 6]]


##### 3-D array

In [8]:
arr = np.array([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [4, 5, 6]]])

print(arr)

[[[1 2 3]
  [4 5 6]]

 [[1 2 3]
  [4 5 6]]]


##### Checking number of dimensions in an array

Note: The "ndim" attribute gives the number of dimensions of an array

In [9]:
a = np.array(42)
b = np.array([1, 2, 3, 4, 5])
c = np.array([[1, 2, 3], [4, 5, 6]])
d = np.array([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [4, 5, 6]]])

print(a.ndim)
print(b.ndim)
print(c.ndim)
print(d.ndim)

0
1
2
3


Note: The "ndmin" argument sets the minimum number of dimensions the resulting array will have

In [10]:
arr = np.array([1, 2, 3, 4], ndmin = 5)

print(arr)
print("Number of Dimensions: ", arr.ndim)

[[[[[1 2 3 4]]]]]
Number of Dimensions:  5
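
To see where the extra dimensions come from, the following sketch prints the shape of an array built with "ndmin" and checks that "ndmin" acts only as a lower bound. The second array is an extra example of mine, not from the original source.

```python
import numpy as np

arr = np.array([1, 2, 3, 4], ndmin=5)

# the added axes are placed at the front, each with length 1
print(arr.shape)

# ndmin is only a minimum: an input that is already 2-D stays 2-D
arr2 = np.array([[1, 2], [3, 4]], ndmin=1)
print(arr2.ndim)
```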


#### Array indexing

Note: Index (from the left) starts from 0

##### 1-D array

In [11]:
arr = np.array([1, 2, 3, 4])

## prints first element
print(arr[0])

1


In [12]:
arr = np.array([1, 2, 3, 4])

## adds third and fourth elements
print(arr[2] + arr[3])

7


##### 2-D array

In [13]:
arr = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])

print(arr)
print("Second element on first row: ", arr[0, 1])

[[ 1  2  3  4  5]
 [ 6  7  8  9 10]]
Second element on first row:  2


In [14]:
arr = np.array([[1,2,3,4,5], [6,7,8,9,10]])

print("Fifth element on second row: ", arr[1, 4])

Fifth element on second row:  10


##### 3-D array

In [15]:
arr = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])

print(arr[0, 1, 2])

6


##### Negative indexing

Note: Index (from the right) starts from -1

In [16]:
arr = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])

print("Last element of the second row: ", arr[1, -1])

Last element of the second row:  10


#### Array slicing

##### 1-D array

In [17]:
arr = np.array([1, 2, 3, 4, 5, 6, 7])

### prints elements from index 1 up to, but not including, index 5
print(arr[1:5])

[2 3 4 5]


In [18]:
arr = np.array([1, 2, 3, 4, 5, 6, 7])

### prints elements from index 4 onwards
print(arr[4:])

[5 6 7]


In [19]:
arr = np.array([1, 2, 3, 4, 5, 6, 7])

### prints elements from the start up to, but not including, index 4
print(arr[:4])

[1 2 3 4]


In [20]:
arr = np.array([1, 2, 3, 4, 5, 6, 7])

### prints elements from the third-last position up to, but not including, the last position
print(arr[-3:-1])

[5 6]


In [21]:
arr = np.array([1, 2, 3, 4, 5, 6, 7])

### prints every second element from index 1 up to, but not including, index 5
print(arr[1:5:2])

[2 4]


In [22]:
arr = np.array([1, 2, 3, 4, 5, 6, 7])

### prints every element at intervals of 3
print(arr[::3])

[1 4 7]


##### 2-D array

In [23]:
arr = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])

print(arr)
print(" ")

### prints elements from column index 1 up to, but not including, index 4, from the second row only (row index 1)
print(arr[1, 1:4])

[[ 1  2  3  4  5]
 [ 6  7  8  9 10]]
 
[7 8 9]


In [24]:
arr = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])

print(arr)
print(" ")

### prints the element in the third column (column index 2) for row indices 0 up to, but not including, 2
print(arr[0:2, 2])

[[ 1  2  3  4  5]
 [ 6  7  8  9 10]]
 
[3 8]


In [25]:
arr = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])

print(arr)
print(" ")

### prints rows with indices 0 up to, but not including, 2 and columns with indices 1 up to, but not including, 4
print(arr[0:2, 1:4])

[[ 1  2  3  4  5]
 [ 6  7  8  9 10]]
 
[[2 3 4]
 [7 8 9]]
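
The slicing rule used throughout this section is `arr[start:stop:step]`, where the element at `stop` is not included. A minimal sketch that checks this against an explicit index loop (the helper name `manual` is my own):

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7])

# arr[start:stop:step] takes indices start, start + step, ... and stops before stop
manual = np.array([arr[i] for i in range(1, 5, 2)])
print(arr[1:5:2], manual)

# with the default step of 1, a slice holds stop - start elements
print(len(arr[1:5]))
```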


### Datatypes

NumPy identifies each kind of data type by a single character: i - integer, b - boolean, u - unsigned integer, f - float, c - complex float, m - timedelta, M - datetime, O - object, S - byte string, U - unicode string, V - fixed chunk of memory for other type (void)
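
The characters listed above can be read back from the "kind" attribute of a dtype. The sketch below does this for a handful of dtype strings that I picked for illustration; the sizes and units in them (such as "i4" or "M8[D]") are arbitrary choices.

```python
import numpy as np

# example dtype strings (chosen for illustration), one per kind
examples = ["i4", "?", "u1", "f8", "c16", "m8[s]", "M8[D]", "O", "S5", "U5", "V4"]

# map each dtype string to the single-character kind code it reports
kinds = {name: np.dtype(name).kind for name in examples}

for name, kind in kinds.items():
    print(name, "->", kind)
```

One thing to watch for: the boolean dtype string is "?", while passing "b" on its own to np.dtype() creates an 8-bit integer, so the kind letters are not always valid dtype strings by themselves.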

In [26]:
arr = np.array([1, 2, 3, 4])

print(arr.dtype)

int64


In [27]:
arr = np.array([1, 2, 3, 4], dtype = "S")

print(arr)
print(arr.dtype)

[b'1' b'2' b'3' b'4']
|S1


In [28]:
arr = np.array([1, 2, 3, 4], dtype = "i4")

print(arr)
print(arr.dtype)

[1 2 3 4]
int32


In [29]:
arr = np.array([1.1, 2.1, 3.1])

newarr = arr.astype(int)
## alternate code: newarr = arr.astype("i") (note: "i" is typically a 32-bit integer, so the dtype printed would be int32)

print(newarr)
print(newarr.dtype)

[1 2 3]
int64
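
Converting floats with astype(int) drops the fractional part rather than rounding. A short sketch with values of my own choosing, including a negative number, and np.rint() as one possible way to round first:

```python
import numpy as np

arr = np.array([1.9, -1.9, 2.5])

# astype(int) discards the fractional part (rounds toward zero)
truncated = arr.astype(int)

# np.rint rounds to the nearest integer first (halves go to the even neighbour)
rounded = np.rint(arr).astype(int)

print(truncated)
print(rounded)
```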


In [30]:
arr = np.array([1, 0, 3])

newarr = arr.astype(bool)

print(newarr)
print(newarr.dtype)

[ True False  True]
bool


### Array copy vs. array view

Note: The main difference between a copy and a view of an array is that the copy is a new array with its own data, while the view is just another way of looking at the original array's data.
The copy owns the data, so changes made to the copy will not affect the original array, and changes made to the original array will not affect the copy.
The view does not own the data, so changes made to the view will affect the original array, and changes made to the original array will affect the view.

#### Copy

In [31]:
arr = np.array([1, 2, 3, 4, 5])
x = arr.copy()
arr[0] = 42

print(arr)
print(x)

[42  2  3  4  5]
[1 2 3 4 5]


#### View

In [32]:
arr = np.array([1, 2, 3, 4, 5])
x = arr.view()
arr[0] = 42

print(arr)
print(x)

[42  2  3  4  5]
[42  2  3  4  5]


##### Making changes in view

In [33]:
arr = np.array([1, 2, 3, 4, 5])
x = arr.view()
x[0] = 31

print(arr)
print(x)

[31  2  3  4  5]
[31  2  3  4  5]
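
The ownership rule from this section can be checked directly with the "base" attribute and np.shares_memory(). The sketch below also includes a plain slice, which is my own addition to show that basic slicing returns a view as well.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

copied = arr.copy()
viewed = arr.view()
sliced = arr[1:4]            # basic slicing also returns a view

print(copied.base is None)   # a copy owns its data
print(viewed.base is arr)    # a view points back to the original
print(np.shares_memory(arr, sliced))

sliced[0] = 99               # writes through to arr[1]
print(arr)
print(copied)                # the copy is unaffected
```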


### Array shape

In [34]:
arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])

print(arr.shape)

(2, 4)


### Reshaping arrays

#### Reshaping from 1-D to 2-D

In [35]:
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])
newarr = arr.reshape(4, 3)

print(arr)
print(" ")
print(newarr)

[ 1  2  3  4  5  6  7  8  9 10 11 12]
 
[[ 1  2  3]
 [ 4  5  6]
 [ 7  8  9]
 [10 11 12]]


#### Reshaping from 1-D to 3-D

In [36]:
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])
newarr = arr.reshape(2, 3, 2)

print(arr)
print(" ")
print(newarr)

[ 1  2  3  4  5  6  7  8  9 10 11 12]
 
[[[ 1  2]
  [ 3  4]
  [ 5  6]]

 [[ 7  8]
  [ 9 10]
  [11 12]]]


Note: An array can be reshaped into any shape as long as both shapes contain the same total number of elements.
For example, an 8-element 1-D array can be reshaped into a 2-D array with 2 rows of 4 elements, but not into a 2-D array with 3 rows of 3 elements, since that would require 3 x 3 = 9 elements.
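
A minimal sketch of the rule above, using the same 8-element example: the first reshape succeeds and the second raises a ValueError. The flag name `failed` is only there to make the outcome easy to inspect.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])

ok = arr.reshape(2, 4)       # 2 * 4 == 8, so this works
print(ok.shape, ok.size == arr.size)

try:
    arr.reshape(3, 3)        # 3 * 3 == 9, which does not match 8 elements
    failed = False
except ValueError as err:
    failed = True
    print("reshape failed:", err)
```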

#### Checking if reshaped array is copy or view

In [37]:
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])

print(arr)
print(" ")
print(arr.reshape(2, 4).base)

[1 2 3 4 5 6 7 8]
 
[1 2 3 4 5 6 7 8]


Note: "base" returns the original array here, so the reshaped array is a view. For an array that owns its data, "base" is None.
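
reshape() returns a view only when the new shape can be laid over the existing memory. The sketch below, which goes slightly beyond the original example, uses np.shares_memory() to compare a plain reshape with flattening a transposed array, where NumPy has to copy.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])

reshaped = arr.reshape(2, 4)
print(np.shares_memory(arr, reshaped))   # same buffer, so this is a view

# flattening the transpose needs the elements in a different memory order,
# so NumPy has to build a new buffer
flat_t = reshaped.T.reshape(-1)
print(np.shares_memory(arr, flat_t))
print(flat_t)
```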

#### Reshaping into unknown dimension

In [38]:
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])

### "-1" refers to an unknown dimension that numpy can calculate by itself
newarr = arr.reshape(2, 2, -1)

print(newarr)

[[[1 2]
  [3 4]]

 [[5 6]
  [7 8]]]


Note: Only one unknown dimension is allowed.

#### Flattening an array: Converting a multidimensional array into 1-D

In [39]:
arr = np.array([[1, 2, 3], [4, 5, 6]])

### "reshape(-1)" flattens an array
newarr = arr.reshape(-1)

print(newarr)

[1 2 3 4 5 6]
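
Besides reshape(-1), NumPy also offers ravel() and flatten() for the same job. A short comparison sketch (these two methods are not covered in the source tutorial section, so treat this as a side note):

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

a = arr.reshape(-1)   # returns a view when possible
b = arr.ravel()       # returns a view when possible
c = arr.flatten()     # always returns a copy

print(a, b, c)
print(np.shares_memory(arr, a), np.shares_memory(arr, b), np.shares_memory(arr, c))
```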


### Iterating through arrays

#### 1-D array

In [40]:
arr = np.array([1, 2, 3])

print(arr)
print(" ")

### loop runs through array and prints each element
for x in arr:
  print(x)

[1 2 3]
 
1
2
3


#### 2-D array

In [41]:
arr = np.array([[1, 2, 3], [4, 5, 6]])

print(arr)
print(" ")

### loop runs through each row and prints it
for x in arr:
  print(x)

[[1 2 3]
 [4 5 6]]
 
[1 2 3]
[4 5 6]


In [42]:
arr = np.array([[1, 2, 3], [4, 5, 6]])

print(arr)
print(" ")

### loop runs through each row and extracts and prints each element separately

## loop runs through each row
for x in arr:
  ## loop runs through elements of each row and prints it
  for y in x:
    print(y)

[[1 2 3]
 [4 5 6]]
 
1
2
3
4
5
6


#### 3-D array

In [43]:
arr = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])

### loop runs through the first axis and prints each 2-D sub-array
for x in arr:
  print(x)

[[1 2 3]
 [4 5 6]]
[[ 7  8  9]
 [10 11 12]]


In [44]:
arr = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])

### loop runs through the first axis (each item is a 2-D array)
for x in arr:
  ## loop runs through the rows of each 2-D array
  for y in x:
    ## loop runs through the elements of each row and prints them
    for z in y:
      print(z)

1
2
3
4
5
6
7
8
9
10
11
12


In [45]:
arr = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])

### "nditer()" visits every scalar element of the array in turn, whatever the number of dimensions
for x in np.nditer(arr):
  print(x)

1
2
3
4
5
6
7
8


Note: With basic for loops, iterating through each scalar of an n-dimensional array takes n nested loops, which becomes difficult to write for arrays with very high dimensionality. The function "nditer()" is a helper that covers everything from very basic to very advanced iteration.
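
To make the comparison concrete, the sketch below collects the elements of a 3-D array once with three nested loops and once with nditer(), then checks that both give the same sequence. The loop variable names are my own.

```python
import numpy as np

arr = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])

# three nested loops, one per dimension
nested = []
for plane in arr:
    for row in plane:
        for value in row:
            nested.append(int(value))

# a single loop with nditer, independent of the number of dimensions
with_nditer = [int(x) for x in np.nditer(arr)]

print(nested)
print(nested == with_nditer)
```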

#### Iterating through arrays of different datatypes

Note: We can use the op_dtypes argument and pass it the expected datatype to change the datatype of elements while iterating. NumPy does not change the datatype of an element in place (where the element sits in the array), so it needs some other space to perform the conversion. That extra space is called a buffer, and to enable it in nditer() we pass flags=['buffered'].
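
A hedged sketch of why the buffer matters, using float64 as the target type (my choice for illustration): with flags=["buffered"] the conversion works, and without it nditer() raises a TypeError because it has nowhere to store the converted values.

```python
import numpy as np

arr = np.array([1, 2, 3])

# with buffering enabled, each element is converted to float64 as it is yielded
as_float = [x.item() for x in np.nditer(arr, flags=["buffered"], op_dtypes=["float64"])]
print(as_float)

# without a buffer, the requested conversion cannot be carried out
try:
    np.nditer(arr, op_dtypes=["float64"])
    raised = False
except TypeError as err:
    raised = True
    print("nditer refused:", err)

# the original array keeps its integer datatype
print(arr.dtype)
```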

##### Iterating through array as a string

In [46]:
arr = np.array([1, 2, 3])

for x in np.nditer(arr, flags = ["buffered"], op_dtypes = ["S"]):
  print(x)

b'1'
b'2'
b'3'


##### Iterating through every element at intervals of 2

In [47]:
arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])

for x in np.nditer(arr[:, ::2]):
  print(x)

1
3
5
7


##### Enumerated iteration

Note: Enumeration means listing the sequence/index number of each item one by one.

In [48]:
arr = np.array([1, 2, 3])

for idx, x in np.ndenumerate(arr):
  print(idx, x)

(0,) 1
(1,) 2
(2,) 3


In [49]:
arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])

for idx, x in np.ndenumerate(arr):
  print(idx, x)

(0, 0) 1
(0, 1) 2
(0, 2) 3
(0, 3) 4
(1, 0) 5
(1, 1) 6
(1, 2) 7
(1, 3) 8
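
Because ndenumerate() yields the full index tuple, it is handy when the position of an element matters. A small sketch, with a threshold of 6 picked arbitrarily:

```python
import numpy as np

arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])

# collect the index tuples of all elements greater than 6
large = [idx for idx, x in np.ndenumerate(arr) if x > 6]
print(large)

# each index tuple can be used directly to look the element up again
values = [int(arr[idx]) for idx in large]
print(values)
```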


### Joining arrays

In [50]:
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])

arr = np.concatenate((arr1, arr2))

print(arr)

[1 2 3 4 5 6]


In [51]:
arr1 = np.array([[1, 2], [3, 4]])
arr2 = np.array([[5, 6], [7, 8]])

### joins two 2-D arrays horizontally
arr = np.concatenate((arr1, arr2), axis = 1)

print(arr1)
print(" ")
print(arr2)
print(" ")
print(arr)

[[1 2]
 [3 4]]
 
[[5 6]
 [7 8]]
 
[[1 2 5 6]
 [3 4 7 8]]


In [52]:
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])

### stacks the arrays along a new second axis, so each input array becomes a column
arr = np.stack((arr1, arr2), axis = 1)

print(arr1)
print(" ")
print(arr2)
print(" ")
print(arr)

[1 2 3]
 
[4 5 6]
 
[[1 4]
 [2 5]
 [3 6]]


#### Horizontal stacking (stacking along rows): hstack( )

In [53]:
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])

### combines both arrays into one horizontally
arr = np.hstack((arr1, arr2))

print(arr1)
print(" ")
print(arr2)
print(" ")
print(arr)

[1 2 3]
 
[4 5 6]
 
[1 2 3 4 5 6]


#### Vertical stacking (stacking along columns): vstack( )

In [54]:
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])

### stacks one array on top of the other
arr = np.vstack((arr1, arr2))

print(arr1)
print(" ")
print(arr2)
print(" ")
print(arr)

[1 2 3]
 
[4 5 6]
 
[[1 2 3]
 [4 5 6]]


#### Height/depth stacking: dstack( ) 

In [55]:
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])

### stacks the arrays along the third axis (depth)
arr = np.dstack((arr1, arr2))

print(arr1)
print(" ")
print(arr2)
print(" ")
print(arr)

[1 2 3]
 
[4 5 6]
 
[[[1 4]
  [2 5]
  [3 6]]]
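
One way to keep the joining functions apart is to compare the shapes they produce for the same two 1-D inputs. The sketch below gathers them in a dictionary; the dictionary keys are labels of my own.

```python
import numpy as np

arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])

# shape of the result for each joining function used in this section
shapes = {
    "concatenate": np.concatenate((arr1, arr2)).shape,
    "stack axis=0": np.stack((arr1, arr2), axis=0).shape,
    "stack axis=1": np.stack((arr1, arr2), axis=1).shape,
    "hstack": np.hstack((arr1, arr2)).shape,
    "vstack": np.vstack((arr1, arr2)).shape,
    "dstack": np.dstack((arr1, arr2)).shape,
}

for name, shape in shapes.items():
    print(name, shape)
```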


### Splitting arrays

In [56]:
arr = np.array([1, 2, 3, 4, 5, 6])

### splits array into three parts
newarr = np.array_split(arr, 3)

print(newarr)

[array([1, 2]), array([3, 4]), array([5, 6])]


In [57]:
arr = np.array([1, 2, 3, 4, 5, 6])

### splits array into four parts
newarr = np.array_split(arr, 4)

print(newarr)

[array([1, 2]), array([3, 4]), array([5]), array([6])]


Note: If the array cannot be divided evenly, array_split() makes the leading sub-arrays one element longer and the trailing ones shorter, as in the example above.
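
The sketch below prints the sizes of the parts for the uneven split and, for contrast, shows that np.split() raises a ValueError in the same situation. The flag name `raised` is only used to record the outcome.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6])

# array_split tolerates an uneven division
parts = np.array_split(arr, 4)
sizes = [len(p) for p in parts]
print(sizes)

# split insists on equal parts and fails otherwise
try:
    np.split(arr, 4)
    raised = False
except ValueError as err:
    raised = True
    print("split failed:", err)
```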

#### Splitting into arrays

In [58]:
arr = np.array([1, 2, 3, 4, 5, 6])

### splits the array into three parts and accesses each sub-array by index
newarr = np.array_split(arr, 3)

print(newarr[0])
print(newarr[1])
print(newarr[2])

[1 2]
[3 4]
[5 6]


In [59]:
arr = np.array([[1, 2], [3, 4], [5, 6], [7, 8], [9, 10], [11, 12]])

### splits into three arrays 
newarr = np.array_split(arr, 3)

print(arr)
print(" ")
print(newarr)

[[ 1  2]
 [ 3  4]
 [ 5  6]
 [ 7  8]
 [ 9 10]
 [11 12]]
 
[array([[1, 2],
       [3, 4]]), array([[5, 6],
       [7, 8]]), array([[ 9, 10],
       [11, 12]])]


In [60]:
arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12], [13, 14, 15], [16, 17, 18]])

### splits into three arrays 
newarr = np.array_split(arr, 3)

print(arr)
print(" ")
print(newarr)

[[ 1  2  3]
 [ 4  5  6]
 [ 7  8  9]
 [10 11 12]
 [13 14 15]
 [16 17 18]]
 
[array([[1, 2, 3],
       [4, 5, 6]]), array([[ 7,  8,  9],
       [10, 11, 12]]), array([[13, 14, 15],
       [16, 17, 18]])]


In [61]:
arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12], [13, 14, 15], [16, 17, 18]])

### splits array along columns (axis 1)
newarr = np.array_split(arr, 3, axis = 1)
## alternate code: newarr = np.hsplit(arr, 3)
## hsplit() is the opposite of hstack()

print(arr)
print(" ")
print(newarr)

[[ 1  2  3]
 [ 4  5  6]
 [ 7  8  9]
 [10 11 12]
 [13 14 15]
 [16 17 18]]
 
[array([[ 1],
       [ 4],
       [ 7],
       [10],
       [13],
       [16]]), array([[ 2],
       [ 5],
       [ 8],
       [11],
       [14],
       [17]]), array([[ 3],
       [ 6],
       [ 9],
       [12],
       [15],
       [18]])]


### Searching arrays

In [62]:
arr = np.array([1, 2, 3, 4, 5, 4, 4])

### prints indices where values are "4"
x = np.where(arr == 4)

print(x)

(array([3, 5, 6]),)


In [63]:
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])

### prints indices with even numbers
x = np.where(arr%2 == 0)

print(x)

(array([1, 3, 5, 7]),)


In [64]:
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])

### prints indices with odd numbers
x = np.where(arr%2 == 1)

print(x)

(array([0, 2, 4, 6]),)
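
The result of where() is a tuple with one index array per dimension, which is why the output above is wrapped in parentheses. A short sketch of unpacking it, with a small 2-D array added by me to show the two-array case:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 4, 4])

result = np.where(arr == 4)
indices = result[0]          # the tuple holds one index array per dimension
print(indices)
print(arr[indices])          # indexing with them recovers the matching values

# for a 2-D array the tuple contains row indices and column indices
grid = np.array([[1, 4], [4, 2]])
rows, cols = np.where(grid == 4)
positions = list(zip(rows.tolist(), cols.tolist()))
print(positions)
```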


#### Search sorted arrays

Note: searchsorted() performs a binary search in the array, and returns the index where the specified value would be inserted to maintain the sort order. It assumes the array is already sorted.

In [65]:
arr = np.array([6, 7, 8, 9])

### prints index where "7" should be inserted to maintain the order
x = np.searchsorted(arr, 7)

print(x)

1


##### Search sorted from right-side

Note: By default the leftmost index is returned, but we can pass side = "right" to return the rightmost index instead.

In [66]:
arr = np.array([6, 7, 8, 9])

### returns the rightmost suitable index: "7" would be inserted after the existing 7, just before the first larger value
x = np.searchsorted(arr, 7, side = "right")

print(x)

2


In [67]:
arr = np.array([1, 3, 5, 7])

### prints indices where values "2", "4" and "6" are to be inserted to maintain order
x = np.searchsorted(arr, [2, 4, 6])

print(x)

[1 2 3]
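
The difference between the left and right side is easiest to see when the value occurs more than once. The sketch below uses an array with a repeated 7 (my own example) and then uses np.insert() to place a new value at the index found.

```python
import numpy as np

arr = np.array([6, 7, 7, 8, 9])

left = np.searchsorted(arr, 7)                 # index before the existing 7s
right = np.searchsorted(arr, 7, side="right")  # index after the existing 7s
print(left, right)

# inserting at either index keeps the array sorted
inserted = np.insert(arr, left, 7)
print(inserted)
```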


### Sorting arrays

In [68]:
arr = np.array([3, 2, 0, 1])

print(np.sort(arr))

[0 1 2 3]


In [69]:
arr = np.array(["banana", "cherry", "apple"])

print(np.sort(arr))

['apple' 'banana' 'cherry']


In [70]:
arr = np.array([True, False, True])

print(np.sort(arr))

[False  True  True]


In [71]:
arr = np.array([[3, 2, 4], [5, 0, 1]])

print(arr)
print(" ")
print(np.sort(arr))

[[3 2 4]
 [5 0 1]]
 
[[2 3 4]
 [0 1 5]]
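
For 2-D arrays np.sort() works along the last axis by default, which is why each row above is sorted separately. The sketch below adds the "axis" argument and the in-place sort() method for comparison; both go a little beyond the original examples.

```python
import numpy as np

arr = np.array([[3, 2, 4], [5, 0, 1]])

by_row = np.sort(arr)            # default axis=-1: sorts within each row
by_col = np.sort(arr, axis=0)    # sorts within each column
print(by_col)
print(arr)                       # np.sort returned copies, so arr is unchanged

arr.sort()                       # the method form sorts arr itself, row by row
print(np.array_equal(arr, by_row))
```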


### Filtering arrays

In [72]:
arr = np.array([41, 42, 43, 44])
x = [True, False, True, False]

newarr = arr[x]

### "newarr" keeps the elements of "arr" whose positions hold "True" in "x"
print(newarr)

[41 43]


In [73]:
arr = np.array([41, 42, 43, 44])

### creates an empty list
filter_arr = []

### loop runs through each element in "arr"
for element in arr:
    
  ### if the element is higher than 42, set the value to "True", otherwise "False"
  if element > 42:
    filter_arr.append(True)
  else:
    filter_arr.append(False)

newarr = arr[filter_arr]

print(filter_arr)
print(newarr)

[False, False, True, True]
[43 44]


In [74]:
arr = np.array([1, 2, 3, 4, 5, 6, 7])

### creates an empty list
filter_arr = []

### loop runs through each element in "arr"
for element in arr:
    
  ### if the element is even, set the value to "True", otherwise "False"
  if element % 2 == 0:
    filter_arr.append(True)
  else:
    filter_arr.append(False)

newarr = arr[filter_arr]

print(filter_arr)
print(newarr)

[False, True, False, True, False, True, False]
[2 4 6]


In [75]:
arr = np.array([41, 42, 43, 44])

filter_arr = arr > 42

newarr = arr[filter_arr]

print(filter_arr)
print(newarr)

[False False  True  True]
[43 44]


In [76]:
arr = np.array([1, 2, 3, 4, 5, 6, 7])

### sets up conditional filter
filter_arr = (arr % 2 == 0)

newarr = arr[filter_arr]

print(filter_arr)
print(newarr)

[False  True False  True False  True False]
[2 4 6]
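
Conditional filters can also be combined. A minimal sketch, assuming we want to mix the conditions used above: "&" means and, "|" means or, "~" means not, and each condition needs its own parentheses.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7])

# even AND greater than 2
even_and_large = arr[(arr % 2 == 0) & (arr > 2)]

# smaller than 2 OR greater than 6
small_or_large = arr[(arr < 2) | (arr > 6)]

# NOT even
not_even = arr[~(arr % 2 == 0)]

print(even_and_large)
print(small_or_large)
print(not_even)
```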
