# NumPy

Source: https://www.w3schools.com/python/numpy/

## Create and Access Arrays

In [2]:
import numpy as np

arr = np.array([1, 2, 3, 4])
print(arr[0])

1


In [6]:
arr = np.array([1, 2, 3, 4])
print(arr[0] + arr[-1])

5


In [11]:
arr = np.array([[1,2,3,4,5], [6,7,8,9,10]])
print('2nd element on 1st row: ', arr[0, 1])
print('2nd element on 1st row: ', arr[0][1])

2nd element on 1st row:  2
2nd element on 1st row:  2


In [12]:
print('5th element on 2nd row: ', arr[1, 4])

5th element on 2nd row:  10


In [14]:
# Access 3-D Arrays
arr = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])
print(arr[0, 1, 2])

6


In [13]:
arr = np.array([[1,2,3,4,5], [6,7,8,9,10]])
print('Last element from 2nd dim: ', arr[1, -1])

Last element from 2nd dim:  10


## Array Slicing

In [16]:
# We can also define the step, like this: [start:end:step]
arr = np.array([1, 2, 3, 4, 5, 6, 7])
print(arr[1:5:2])
print(arr[1:5])
print(arr[4:])
print(arr[:4])
print(arr[-3:-1])
print(arr[::2])

[2 4]
[2 3 4 5]
[5 6 7]
[1 2 3 4]
[5 6]
[1 3 5 7]


In [23]:
# Slicing 2-D Arrays
arr = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])
print(arr[1, 1:4])  # From the second row, slice index 1 up to (not including) index 4.
print(arr[0:2, 2])  # From both rows, return the element at index 2.
print(arr[0:2, 1:4])  # From both rows, slice index 1 up to (not including) index 4; this returns a 2-D array.

[7 8 9]
[3 8]
[[2 3 4]
 [7 8 9]]


## Data Types in NumPy

* i - integer
* b - boolean
* u - unsigned integer
* f - float
* c - complex float
* m - timedelta
* M - datetime
* O - object
* S - string
* U - unicode string
* V - fixed chunk of memory for other type (void)

In [20]:
# Get the data type
arr = np.array([1, 2, 3, 4])
print(arr.dtype)

arr = np.array(['apple', 'banana', 'cherry'])
print(arr.dtype)

int64
<U6


In [21]:
# Create an array with data type string:
arr = np.array([1, 2, 3, 4], dtype='S')
print(arr)
print(arr.dtype)

[b'1' b'2' b'3' b'4']
|S1


In [24]:
# Create an array with a 4-byte integer data type:
arr = np.array([1, 2, 3, 4], dtype='i4')
print(arr)
print(arr.dtype)

arr = np.array([1, 2, 3, 4], dtype='i')
print(arr)
print(arr.dtype)

[1 2 3 4]
int32
[1 2 3 4]
int32


In [27]:
# Change data type from float to integer by using 'i' as parameter value
arr = np.array([1.1, 2.1, 3.1])
new_arr = arr.astype('i')
print(new_arr)
print(new_arr.dtype)

# Change data type from float to integer by using int as parameter value
arr = np.array([1.1, 2.1, 3.1])
new_arr = arr.astype(int)
print(new_arr)
print(new_arr.dtype)

# Change data type from integer to boolean
arr = np.array([1, 0, 3])
new_arr = arr.astype(bool)
print(new_arr)
print(new_arr.dtype)

[1 2 3]
int32
[1 2 3]
int64
[ True False  True]
bool


## The Difference Between Copy and View

* The main difference between a copy and a view is that a copy is a new array with its own data, while a view is another way of looking at the original array's data.
* The copy owns the data: changes made to the copy will not affect the original array, and changes made to the original array will not affect the copy.
* The view does not own the data: changes made to the view will affect the original array, and changes made to the original array will affect the view.

In [29]:
# Copy
arr = np.array([1, 2, 3, 4, 5])
x = arr.copy()
arr[0] = 42
x[-1] = 99
print(arr)
print(x)

[42  2  3  4  5]
[ 1  2  3  4 99]


In [30]:
# View
arr = np.array([1, 2, 3, 4, 5])
x = arr.view()
arr[0] = 42
x[-1] = 99
print(arr)
print(x)

[42  2  3  4 99]
[42  2  3  4 99]


## Check if Array Owns its Data

* As mentioned above, copies own their data and views do not, but how can we check this?
* Every NumPy array has the attribute `base`, which is `None` if the array owns its data.
* Otherwise, the `base` attribute refers to the original object.

In [None]:
arr = np.array([1, 2, 3, 4, 5])

x = arr.copy()
y = arr.view()

print(x.base)  # The copy returns None.
print(y.base)  # The view returns the original array.
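
The ownership check can also be written as explicit identity tests. The sketch below reuses the same `arr`, `x`, and `y`, and additionally calls `np.shares_memory`, which is not part of the original example but is another way to see whether two arrays use the same data.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])
x = arr.copy()
y = arr.view()

# A copy owns its data, so its base is None.
copy_owns_data = x.base is None
# A view's base is the array it was created from.
view_base_is_arr = y.base is arr
# shares_memory reports whether two arrays overlap in memory.
view_shares = np.shares_memory(arr, y)
copy_shares = np.shares_memory(arr, x)

print(copy_owns_data, view_base_is_arr)
print(view_shares, copy_shares)
```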

## Get the Shape of an Array

* Arrays in NumPy have an attribute called `shape` that returns a tuple holding the number of elements along each dimension.
* The example below returns `(2, 4)`, which means that the array has 2 dimensions, where the first dimension has 2 elements and the second has 4.

In [32]:
# Print the shape of a 2-D array
arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
print(arr.shape)

(2, 4)


In [34]:
# Create an array from a 1-D list, forcing at least 2 dimensions with ndmin
arr = np.array([1, 2, 3], ndmin=2)
print(arr)
print("Shape:", arr.shape)


[[1 2 3]]
Shape: (1, 3)


## Reshaping arrays

* Reshaping means changing the shape of an array.
* The shape of an array is the number of elements in each dimension.
* By reshaping we can add or remove dimensions or change number of elements in each dimension.

In [36]:
# Convert the following 1-D array with 12 elements into a 2-D array.
# The outermost dimension will have 4 arrays, each with 3 elements.
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])
print(arr.size)
new_arr = arr.reshape(4, 3)
print(new_arr)

# Convert the following 1-D array with 12 elements into a 3-D array.
# The outermost dimension will have 2 arrays, each containing 3 arrays of 2 elements:
new_arr = arr.reshape(2, 3, 2)
print(new_arr)

12
[[ 1  2  3]
 [ 4  5  6]
 [ 7  8  9]
 [10 11 12]]
[[[ 1  2]
  [ 3  4]
  [ 5  6]]

 [[ 7  8]
  [ 9 10]
  [11 12]]]


## Can We Reshape Into any Shape?

* Yes, as long as both shapes contain the same total number of elements.
* We can reshape a 1-D array of 8 elements into a 2-D array with 2 rows of 4 elements, but not into a 2-D array with 3 rows of 3 elements, as that would require 3x3 = 9 elements.

In [37]:
# Try converting a 1-D array with 8 elements to a 2-D array of shape (3, 3) (will raise an error).

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])
new_arr = arr.reshape(3, 3)
print(new_arr)

ValueError: cannot reshape array of size 8 into shape (3,3)
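
The element-count rule can be checked before calling `reshape()`. This is a minimal sketch; `can_reshape` is a hypothetical helper written for illustration, not a NumPy function.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])

def can_reshape(a, shape):
    """Hypothetical helper: True if `shape` holds exactly a.size elements."""
    return a.size == int(np.prod(shape))

print(can_reshape(arr, (2, 4)))
print(can_reshape(arr, (3, 3)))

# Without the check, the mismatch surfaces as a ValueError.
try:
    arr.reshape(3, 3)
    error_message = ""
except ValueError as exc:
    error_message = str(exc)
print(error_message)
```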

In [38]:
# Check if the returned array is a copy or a view.
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])
print(arr.reshape(2, 4).base)  # The base is the original array, so reshape() returned a view.

[1 2 3 4 5 6 7 8]


## Unknown Dimension

* You are allowed to have one "unknown" dimension.
* Pass `-1` as the value, and NumPy will calculate this number for you.
> We cannot pass `-1` to more than one dimension.

In [39]:
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])
new_arr = arr.reshape(2, 2, -1)
print(new_arr)

[[[1 2]
  [3 4]]

 [[5 6]
  [7 8]]]
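
To illustrate both rules, the sketch below lets NumPy infer a dimension in two different positions and then shows that asking for two unknown dimensions is rejected. The variable names are chosen for illustration only.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])

# NumPy fills in the single unknown dimension from the element count.
inferred_last = arr.reshape(2, 2, -1).shape
inferred_first = arr.reshape(-1, 4).shape
print(inferred_last, inferred_first)

# Two unknown dimensions are ambiguous, so NumPy raises a ValueError.
try:
    arr.reshape(-1, -1)
    two_unknowns_failed = False
except ValueError:
    two_unknowns_failed = True
print(two_unknowns_failed)
```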


## Flattening the arrays

* Flattening an array means converting a multidimensional array into a 1-D array.
* We can use `reshape(-1)` to do this.
> NumPy has many other functions for changing the shape of an array (`flatten`, `ravel`) and for rearranging its elements (`rot90`, `flip`, `fliplr`, `flipud`, etc.). These fall under the Intermediate to Advanced section of NumPy.

In [40]:
arr = np.array([[1, 2, 3], [4, 5, 6]])
new_arr = arr.reshape(-1)
print(new_arr)

[1 2 3 4 5 6]
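
As a brief look at two of the functions mentioned above: `flatten()` always returns a copy, while `ravel()` returns a view when the memory layout allows it, which is the case for the small contiguous array assumed here.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

flat_copy = arr.flatten()   # always a new array
flat_view = arr.ravel()     # a view here, because arr is contiguous

flat_copy[0] = 100   # does not touch arr
flat_view[1] = 200   # writes through to arr

print(arr)
```

Only the write through `flat_view` shows up in `arr`, which ties flattening back to the copy/view distinction.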


## Iterating

* Iterating over an `n-D` array goes through its first axis, yielding the `(n-1)-D` sub-arrays one by one.

In [41]:
# Iterate on the elements of the following 1-D array.
arr = np.array([1, 2, 3])
for x in arr:
  print(x)

1
2
3


In [43]:
# In a 2-D array it will go through all the rows.
arr = np.array([[1, 2, 3], [4, 5, 6]])
for x in arr:
  print(x)

[1 2 3]
[4 5 6]

In [45]:
# To return the actual values, the scalars, we have to iterate the arrays in each dimension.
arr = np.array([[1, 2, 3], [4, 5, 6]])

for x in arr:
    for y in x:
        print(y, end=' ')

1 2 3 4 5 6 

In [46]:
# In a 3-D array it will go through all the 2-D arrays.
arr = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])
for x in arr:
  print(x)

[[1 2 3]
 [4 5 6]]
[[ 7  8  9]
 [10 11 12]]


In [47]:
# To return the actual values, the scalars, we have to iterate the arrays in each dimension.
for x in arr:
  for y in x:
    for z in y:
      print(z, end=' ')

1 2 3 4 5 6 7 8 9 10 11 12 

### Iterating Arrays Using `nditer()`

* With basic `for` loops, reaching each scalar of an n-D array takes n nested loops, which becomes difficult to write for arrays with very high dimensionality. `nditer()` visits every scalar with a single loop.

In [49]:
arr = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])

for x in np.nditer(arr):
  print(x, end=' ')

1 2 3 4 5 6 7 8 

### Iterating Array With Different Data Types

In [50]:
arr = np.array([1, 2, 3])

for x in np.nditer(arr, flags=['buffered'], op_dtypes=['S']):
  print(x)

b'1'
b'2'
b'3'


### Iterating With Different Step Size

In [51]:
arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
for x in np.nditer(arr[:, ::2]):
  print(x)

1
3
5
7


### Enumerated Iteration Using `ndenumerate()`

In [56]:
# Enumerate the elements of the following 1-D array.
arr = np.array([1, 2, 3])
for idx, x in np.ndenumerate(arr):
  print(idx, x)

(0,) 1
(1,) 2
(2,) 3


In [57]:
# Enumerate on following 2D array's elements.
arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
for idx, x in np.ndenumerate(arr):
  print(idx, x)

(0, 0) 1
(0, 1) 2
(0, 2) 3
(0, 3) 4
(1, 0) 5
(1, 1) 6
(1, 2) 7
(1, 3) 8


In [58]:
# Enumerate on following 3D array's elements.
arr = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])
for idx, x in np.ndenumerate(arr):
  print(idx, x)

(0, 0, 0) 1
(0, 0, 1) 2
(0, 0, 2) 3
(0, 1, 0) 4
(0, 1, 1) 5
(0, 1, 2) 6
(1, 0, 0) 7
(1, 0, 1) 8
(1, 0, 2) 9
(1, 1, 0) 10
(1, 1, 1) 11
(1, 1, 2) 12


## Joining NumPy Arrays

* We pass a sequence of arrays that we want to join to the `concatenate()` function, along with the axis.
* If axis is not explicitly passed, it is taken as `0`.

In [59]:
# Join two arrays.
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
arr = np.concatenate((arr1, arr2))
print(arr)

[1 2 3 4 5 6]


In [4]:
# Join two 2-D arrays along rows (axis=0), then along columns (axis=1).
arr1 = np.array([[1, 2], [3, 4]])
arr2 = np.array([[5, 6], [7, 8]])

arr = np.concatenate((arr1, arr2), axis=0)
print(arr)

arr = np.concatenate((arr1, arr2), axis=1)
print(arr)


[[1 2]
 [3 4]
 [5 6]
 [7 8]]
[[1 2 5 6]
 [3 4 7 8]]


## Stacking

### Stacking Along Rows

In [5]:
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
arr = np.hstack((arr1, arr2))
print(arr)

[1 2 3 4 5 6]


### Stacking Along Columns

In [6]:
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
arr = np.vstack((arr1, arr2))
print(arr)

[[1 2 3]
 [4 5 6]]


### Stacking Along Height (depth)

In [7]:
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
arr = np.dstack((arr1, arr2))
print(arr)

[[[1 4]
  [2 5]
  [3 6]]]
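
Comparing the result shapes side by side can make the three stacking functions easier to tell apart. This sketch reuses the same two 1-D arrays from the cells above.

```python
import numpy as np

arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])

# Collect the shape each stacking function produces for the same inputs.
shapes = {
    "hstack": np.hstack((arr1, arr2)).shape,
    "vstack": np.vstack((arr1, arr2)).shape,
    "dstack": np.dstack((arr1, arr2)).shape,
}
for name, shape in shapes.items():
    print(name, shape)
```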


## NumPy Splitting Array

* Splitting is the reverse operation of joining.
* `split()` is also available, but it requires the array to divide evenly: where `array_split()` adjusts the sizes of the parts (as in the 4-part example below), `split()` raises an error.

In [16]:
# Split the array in 3 parts.
arr = np.array([1, 2, 3, 4, 5, 6])
new_arr = np.array_split(arr, 3)
print(new_arr)

# Access the split arrays
print(new_arr[0])
print(new_arr[1])
print(new_arr[2])

[array([1, 2]), array([3, 4]), array([5, 6])]
[1 2]
[3 4]
[5 6]


In [17]:
# If the elements cannot be divided evenly, array_split() makes the later parts one element smaller.
# Split the array in 4 parts.
arr = np.array([1, 2, 3, 4, 5, 6])
new_arr = np.array_split(arr, 4)
print(new_arr)

[array([1, 2]), array([3, 4]), array([5]), array([6])]
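
For contrast, here is a sketch of how `split()` behaves on the same six-element array: it works for 3 parts but raises a `ValueError` for 4, where `array_split()` succeeds.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6])

even_parts = np.split(arr, 3)   # 6 elements divide evenly into 3 parts

try:
    np.split(arr, 4)            # 6 elements do not divide evenly into 4 parts
    split_failed = False
except ValueError:
    split_failed = True

# array_split() accepts the uneven case and varies the part sizes.
uneven_sizes = [len(part) for part in np.array_split(arr, 4)]
print(len(even_parts), split_failed, uneven_sizes)
```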


In [18]:
# Splitting 2-D Arrays. The example returns three 2-D arrays.
arr = np.array([[1, 2], [3, 4], [5, 6], [7, 8], [9, 10], [11, 12]])
new_arr = np.array_split(arr, 3)
print(new_arr)

[array([[1, 2],
       [3, 4]]), array([[5, 6],
       [7, 8]]), array([[ 9, 10],
       [11, 12]])]


In [19]:
# This time each element in the 2-D arrays contains 3 elements.
arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12], [13, 14, 15], [16, 17, 18]])
new_arr = np.array_split(arr, 3)
print(new_arr)

[array([[1, 2, 3],
       [4, 5, 6]]), array([[ 7,  8,  9],
       [10, 11, 12]]), array([[13, 14, 15],
       [16, 17, 18]])]


In [20]:
# You can specify which axis you want to do the split around.
arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12], [13, 14, 15], [16, 17, 18]])
new_arr = np.array_split(arr, 3, axis=1)
print(new_arr)

[array([[ 1],
       [ 4],
       [ 7],
       [10],
       [13],
       [16]]), array([[ 2],
       [ 5],
       [ 8],
       [11],
       [14],
       [17]]), array([[ 3],
       [ 6],
       [ 9],
       [12],
       [15],
       [18]])]


In [21]:
# An alternative is hsplit(), the counterpart of hstack().
arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12], [13, 14, 15], [16, 17, 18]])
new_arr = np.hsplit(arr, 3)
print(new_arr)

[array([[ 1],
       [ 4],
       [ 7],
       [10],
       [13],
       [16]]), array([[ 2],
       [ 5],
       [ 8],
       [11],
       [14],
       [17]]), array([[ 3],
       [ 6],
       [ 9],
       [12],
       [15],
       [18]])]


> The counterparts of `vstack()` and `dstack()` are available as `vsplit()` and `dsplit()`.
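
A minimal sketch of those two functions follows. The input arrays are built with `np.arange` purely for brevity, and the 3-D shape `(2, 2, 4)` is an arbitrary choice for illustration; `dsplit()` needs an array with at least three dimensions.

```python
import numpy as np

# vsplit() cuts along the first axis (rows).
arr_2d = np.arange(1, 19).reshape(6, 3)
row_blocks = np.vsplit(arr_2d, 3)

# dsplit() cuts along the third axis (depth).
arr_3d = np.arange(16).reshape(2, 2, 4)
depth_blocks = np.dsplit(arr_3d, 2)

print([block.shape for block in row_blocks])
print([block.shape for block in depth_blocks])
```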

## Searching Arrays

Get the indices of a value in the array.

In [27]:
arr = np.array([1, 2, 3, 4, 5, 4, 4])
print(np.where(arr == 4))  # The value 4 is present at index 3, 5, and 6.
print(np.where(arr == 1))
print(np.where(arr == -99))

(array([3, 5, 6]),)
(array([0]),)
(array([], dtype=int64),)


In [28]:
# Find the indexes where the values are even.
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])
x = np.where(arr%2 == 0)
print(x)

(array([1, 3, 5, 7]),)


In [None]:
# Find the indexes where the values are odd.
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])
x = np.where(arr%2 == 1)
print(x)

### Search Sorted

* The `searchsorted()` method performs a binary search in the array and returns the index where the specified value would be inserted to maintain the sort order.
> The `searchsorted()` method assumes the array is already sorted.

In [29]:
# Find the index where the value 7 should be inserted.
# The number 7 should be inserted at index 1 to preserve the sort order.
arr = np.array([6, 7, 8, 9])
x = np.searchsorted(arr, 7)
print(x)

1


> By default, the leftmost index is returned, but we can pass `side='right'` to return the rightmost index instead.

In [30]:
# With side='right', the number 7 should be inserted at index 2 to preserve the sort order.
arr = np.array([6, 7, 8, 9])
x = np.searchsorted(arr, 7, side='right')
print(x)

2


#### Search Multiple Values

In [31]:
# The return value is an array: [1 2 3] containing the three indexes where 2, 4, 6 would be inserted in the original array to maintain the order.
arr = np.array([1, 3, 5, 7])
x = np.searchsorted(arr, [2, 4, 6])
print(x)

[1 2 3]
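
One way to confirm that the returned indexes keep the array sorted is to perform the insertion. The sketch below uses `np.insert`, which does not appear in the original text; when given several indexes, it interprets them relative to the original array.

```python
import numpy as np

arr = np.array([1, 3, 5, 7])
new_values = [2, 4, 6]

# Find where each new value belongs, then insert them all at once.
positions = np.searchsorted(arr, new_values)
merged = np.insert(arr, positions, new_values)

print(positions)
print(merged)
```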


## Sorting Arrays

In [39]:
arr = np.array([3, 2, 0, 1])
print(np.sort(arr))  # This method returns a copy of the array, leaving the original array unchanged.

arr = np.array(['banana', 'cherry', 'apple'])
print(np.sort(arr))

arr = np.array([True, False, True])

print(np.sort(arr))

arr = np.array([3, 2, 0, 1, 2.5, 'banana', 'cherry', 'apple', True, False, True, False])
print(np.sort(arr))  # Mixed types are coerced to strings, so the sort is lexicographic: digits, then 'False'/'True', then lowercase words.

[0 1 2 3]
['apple' 'banana' 'cherry']
[False  True  True]
['0' '1' '2' '2.5' '3' 'False' 'False' 'True' 'True' 'apple' 'banana'
 'cherry']
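
The quoted values in the last output indicate that a mixed-type array is stored as strings. The sketch below, with made-up values, checks the dtype and shows how string ordering differs from numeric ordering.

```python
import numpy as np

# Mixing numbers, booleans and text yields a unicode string array.
mixed = np.array([3, 10, 2.5, 'apple', True])
print(mixed.dtype.kind)  # 'U' stands for unicode string

# Strings compare character by character, so '10' sorts before '2'.
as_strings = np.sort(np.array(['10', '9', '2']))
as_numbers = np.sort(np.array([10, 9, 2]))
print(as_strings)
print(as_numbers)
```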


In [40]:
# np.sort() on a 2-D array sorts each row independently (the default is the last axis).
arr = np.array([[3, 2, 4], [5, 0, 1]])
print(np.sort(arr))

[[2 3 4]
 [0 1 5]]
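
The `axis` parameter of `np.sort` controls the direction of the sort. A short sketch on the same 2-D array:

```python
import numpy as np

arr = np.array([[3, 2, 4], [5, 0, 1]])

by_row = np.sort(arr)                # default axis=-1: sort within each row
by_column = np.sort(arr, axis=0)     # sort within each column
flattened = np.sort(arr, axis=None)  # flatten first, then sort

print(by_row)
print(by_column)
print(flattened)
```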


## Filtering Arrays

Getting some elements out of an existing array and creating a new array out of them is called filtering.

* A boolean index list is a list of booleans corresponding to indexes in the array.
* If the value at an index is `True`, that element is included in the filtered array; if it is `False`, that element is excluded.

In [41]:
arr = np.array([41, 42, 43, 44])
x = [True, False, True, False]
new_arr = arr[x]  # Syntax: root_array[filter_list]
print(new_arr)

[41 43]


* The example above will return `[41, 43]`, why?
  * Because the new array contains only the values where the filter array had the value `True`, in this case, index `0` and `2`.

In [42]:
# Create a filter array that will return only values higher than 42.
arr = np.array([41, 42, 43, 44])

# Create an empty list
filter_arr = []

# go through each element in arr
for element in arr:
  # if the element is higher than 42, set the value to True, otherwise False:
  if element > 42:
    filter_arr.append(True)
  else:
    filter_arr.append(False)

new_arr = arr[filter_arr]
print(filter_arr)
print(new_arr)

[False, False, True, True]
[43 44]


### Creating Filter Directly From Array

In [43]:
# Create a filter array that will return only values higher than 42.
arr = np.array([41, 42, 43, 44])
filter_arr = arr > 42
print(filter_arr)
new_arr = arr[filter_arr]
print(new_arr)

[False False  True  True]
[43 44]


In [46]:
# Create a filter array that will return only even elements from the original array.
arr = np.array([1, 2, 3, 4, 5, 6, 7])
new_arr = arr[arr % 2 == 0]
print(new_arr)

[2 4 6]
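
Filter conditions can also be combined. The sketch below goes slightly beyond the original examples: it uses the element-wise operators `&` (and) and `~` (not), and the parentheses around each comparison are required because these operators bind more tightly than comparisons.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7])

# Keep values that are even AND greater than 2.
even_and_large = arr[(arr % 2 == 0) & (arr > 2)]
# Invert a filter with ~ to keep the values that are NOT even.
not_even = arr[~(arr % 2 == 0)]

print(even_and_large)
print(not_even)
```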
