In [2]:
import numpy as np

# NpArray Functions


## Copy vs View


The main difference between a copy and a view is that a copy is a new array with its own data, while a view is just another way of looking at the data of the original array.

The copy owns its data: changes made to the copy do not affect the original array, and changes made to the original array do not affect the copy.

The view does not own its data: changes made to the view affect the original array, and changes made to the original array affect the view.

In [3]:
# COPY

arr = np.array([1, 2, 3, 4, 5])
x = arr.copy()
arr[0] = 42

print(arr)
print(x) 

[42  2  3  4  5]
[1 2 3 4 5]


In [4]:
# VIEW

arr = np.array([1, 2, 3, 4, 5])
x = arr.view()
arr[0] = 42

print(arr)
print(x) 

[42  2  3  4  5]
[42  2  3  4  5]


### Check if Array Owns its Data

As mentioned above, copies own their data and views do not, but how can we check this?

Every NumPy array has a base attribute, which is None if the array owns its data.

Otherwise, the base attribute refers to the original object.

In [5]:
arr = np.array([1, 2, 3, 4, 5])

x = arr.copy()
y = arr.view()

print(x.base)
print(y.base) 

None
[1 2 3 4 5]
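
Slicing is another common way to end up with a view. The following is a minimal sketch, assuming the standard behavior of basic slicing; np.shares_memory() is not used elsewhere in this notebook and is brought in here only as a second way to check ownership.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

sliced = arr[1:4]          # basic slicing returns a view
copied = arr[1:4].copy()   # an explicit copy owns its own data

print(sliced.base is arr)             # True: the slice refers back to arr
print(copied.base is None)            # True: the copy owns its data
print(np.shares_memory(arr, sliced))  # True
print(np.shares_memory(arr, copied))  # False

sliced[0] = 99   # writing through the view also changes arr
print(arr)
```

Because the slice is a view, the last print shows the change in arr, while copied keeps its old values.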


## Array Shape

The shape of an array is the number of elements in each dimension.

In [6]:
arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])

print(arr.shape) 

(2, 4)


In [7]:
arr = np.array([1, 2, 3, 4], ndmin=5)

print(arr)
print('shape of array :', arr.shape) 

[[[[[1 2 3 4]]]]]
shape of array : (1, 1, 1, 1, 4)


## Array Reshape

Reshaping means changing the shape of an array.

The shape of an array is the number of elements in each dimension.

By reshaping we can add or remove dimensions or change the number of elements in each dimension, as long as the total number of elements stays the same.

In [8]:
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])

newarr = arr.reshape(2, 3, 2)

print(newarr) 

[[[ 1  2]
  [ 3  4]
  [ 5  6]]

 [[ 7  8]
  [ 9 10]
  [11 12]]]


Flattening an array means converting a multidimensional array into a 1-D array.

We can use reshape(-1) to do this.

In [9]:
arr = np.array([[1, 2, 3], [4, 5, 6]])

newarr = arr.reshape(-1)

print(newarr) 

[1 2 3 4 5 6]
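
The -1 does not have to stand alone: it can take the place of one dimension, which NumPy then infers from the size of the array. A small sketch with an 8-element array chosen for illustration:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])

# -1 asks NumPy to work out that dimension from the array's size
newarr = arr.reshape(2, 2, -1)
print(newarr.shape)   # (2, 2, 2)

# For a contiguous array like this one, reshape returns a view
print(newarr.base is arr)   # True

# The element counts must match: 8 elements cannot fill a 3 x 3 array
try:
    arr.reshape(3, 3)
except ValueError as err:
    print("reshape failed:", err)
```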


## Array Iterating

Iterating means going through elements one by one.

Even for multi-dimensional NumPy arrays, we can do this with a basic Python for loop.

If we iterate on a 1-D array it will go through each element one by one.

In [10]:
arr = np.array([1, 2, 3])

for x in arr:
  print(x) 

1
2
3


In [None]:
arr = np.array([[1, 2, 3], [4, 5, 6]]) # 2-D

for x in arr:
  print(x) 

[1 2 3]
[4 5 6]


In [12]:
# Iterate on each scalar element of the 2-D array: 
arr = np.array([[1, 2, 3], [4, 5, 6]])

for x in arr:
  for y in x:
    print(y) 

1
2
3
4
5
6


### Iterating using nditer()

The function nditer() is a helper function that supports anything from very basic to very advanced iteration. It solves some common issues we face when iterating; let's go through it with examples.

In [14]:
arr = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]]) #3-D array

for x in np.nditer(arr):
  print(x) 

1
2
3
4
5
6
7
8


#### Iterating Array With Different Data Types

We can use the op_dtypes argument and pass it the expected data type to change the data type of the elements while iterating.

NumPy does not change the data type of an element in place (where the element sits in the array), so it needs some other space to perform the conversion. That extra space is called a buffer, and to enable it in nditer() we pass flags=['buffered'].

In [15]:
arr = np.array([1, 2, 3])

for x in np.nditer(arr, flags=['buffered'], op_dtypes=['S']):
  print(x) 

np.bytes_(b'1')
np.bytes_(b'2')
np.bytes_(b'3')
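
The same pattern works for numeric conversions. As a hedged sketch, assuming the default casting rules accept an integer-to-float64 conversion, the loop below reads an integer array as floats without modifying it:

```python
import numpy as np

arr = np.array([1, 2, 3])

values = []
for x in np.nditer(arr, flags=['buffered'], op_dtypes=['float64']):
    # x is a 0-d array backed by the buffer; item() copies it out as a Python float
    values.append(x.item())

print(values)
print(arr.dtype)   # the original array keeps its integer dtype
```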


#### Iterating With Different Step Size

In [16]:
arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])

for x in np.nditer(arr[:, ::2]):
  print(x) 

1
3
5
7


### Enumerated Iteration Using ndenumerate()

Enumeration means listing the sequence number of each item, one by one.

Sometimes we need the index of each element while iterating; the ndenumerate() function can be used for those use cases.

In [17]:
arr = np.array([1, 2, 3])

for idx, x in np.ndenumerate(arr):
  print(idx, x) 

(0,) 1
(1,) 2
(2,) 3


In [18]:
arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])

for idx, x in np.ndenumerate(arr):
  print(idx, x) 

(0, 0) 1
(0, 1) 2
(0, 2) 3
(0, 3) 4
(1, 0) 5
(1, 1) 6
(1, 2) 7
(1, 3) 8


## Array Join

Joining means putting the contents of two or more arrays into a single array.

In SQL we join tables based on a key, whereas in NumPy we join arrays by axes.

We pass a sequence of arrays that we want to join to the concatenate() function, along with the axis. If axis is not explicitly passed, it is taken as 0.

In [19]:
arr1 = np.array([1, 2, 3])

arr2 = np.array([4, 5, 6])

arr = np.concatenate((arr1, arr2))

print(arr) 

[1 2 3 4 5 6]


In [20]:
arr1 = np.array([[1, 2], [3, 4]])

arr2 = np.array([[5, 6], [7, 8]])

arr = np.concatenate((arr1, arr2), axis=1)

print(arr) 

[[1 2 5 6]
 [3 4 7 8]]
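
To see what the axis argument changes, it can help to compare the resulting shapes. A short sketch using the same two 2-D arrays:

```python
import numpy as np

arr1 = np.array([[1, 2], [3, 4]])
arr2 = np.array([[5, 6], [7, 8]])

rows = np.concatenate((arr1, arr2))          # default axis=0: rows are appended
cols = np.concatenate((arr1, arr2), axis=1)  # axis=1: columns are appended

print(rows.shape)  # (4, 2)
print(cols.shape)  # (2, 4)
```

Only the length of the chosen axis grows; the other axis must already match between the arrays.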


#### Joining Arrays Using Stack Functions
Stacking is similar to concatenation; the only difference is that stacking is done along a new axis.

For example, two 1-D arrays can be joined along a new second axis, which places their elements side by side in the rows of a 2-D array.

We pass a sequence of arrays that we want to join to the stack() function, along with the axis. If axis is not explicitly passed, it is taken as 0.

In [21]:
arr1 = np.array([1, 2, 3])

arr2 = np.array([4, 5, 6])

arr = np.stack((arr1, arr2), axis=1)

print(arr) 

[[1 4]
 [2 5]
 [3 6]]
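
The difference from concatenate() shows up in the number of dimensions of the result. A minimal sketch comparing the two on the same 1-D inputs:

```python
import numpy as np

arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])

joined = np.concatenate((arr1, arr2))       # stays 1-D
stacked0 = np.stack((arr1, arr2))           # new axis 0: one row per input array
stacked1 = np.stack((arr1, arr2), axis=1)   # new axis 1: one column per input array

print(joined.shape)    # (6,)
print(stacked0.shape)  # (2, 3)
print(stacked1.shape)  # (3, 2)
```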


#### Stacking along Rows
NumPy provides a helper function, hstack(), to stack along rows: the arrays are joined horizontally, so each row becomes longer.

In [22]:
arr1 = np.array([1, 2, 3])

arr2 = np.array([4, 5, 6])

arr = np.hstack((arr1, arr2))

print(arr) 

[1 2 3 4 5 6]


#### Stacking Along Columns
NumPy provides a helper function, vstack(), to stack along columns: the arrays are placed one below the other, so each column becomes longer.

In [23]:
arr1 = np.array([1, 2, 3])

arr2 = np.array([4, 5, 6])

arr = np.vstack((arr1, arr2))

print(arr) 

[[1 2 3]
 [4 5 6]]


#### Stacking Along Height
NumPy provides a helper function, dstack(), to stack along height, which is the same as depth (the third axis).

In [24]:
arr1 = np.array([1, 2, 3])

arr2 = np.array([4, 5, 6])

arr = np.dstack((arr1, arr2))

print(arr) 

[[[1 4]
  [2 5]
  [3 6]]]
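
The three helpers are easier to tell apart on 2-D inputs, where each one extends a different axis. A small sketch, with two 2 x 2 arrays chosen for illustration:

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

h = np.hstack((a, b))  # side by side: rows get longer
v = np.vstack((a, b))  # one below the other: more rows
d = np.dstack((a, b))  # along a third (depth) axis

print(h.shape)  # (2, 4)
print(v.shape)  # (4, 2)
print(d.shape)  # (2, 2, 2)
```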


## Array Split

Splitting is the reverse operation of joining.

Joining merges multiple arrays into one, and splitting breaks one array into multiple arrays.

We use array_split() to split arrays: we pass it the array we want to split and the number of splits.

In [25]:
arr = np.array([1, 2, 3, 4, 5, 6])

newarr = np.array_split(arr, 3)

print(newarr) 

[array([1, 2]), array([3, 4]), array([5, 6])]


In [26]:
arr = np.array([1, 2, 3, 4, 5, 6])

newarr = np.array_split(arr, 4)

print(newarr) 

[array([1, 2]), array([3, 4]), array([5]), array([6])]
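
NumPy also has a split() function, which is not used elsewhere in this notebook. As a hedged comparison: split() insists on an equal division and raises an error otherwise, whereas array_split() adjusts the piece sizes as shown above.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6])

# 6 elements divide evenly into 3 pieces, so split() works here
even_parts = np.split(arr, 3)
print(even_parts)

# 6 elements do not divide evenly into 4 pieces, so split() raises ValueError
split_failed = False
try:
    np.split(arr, 4)
except ValueError as err:
    split_failed = True
    print("split failed:", err)

# array_split() handles the uneven case by making the later pieces shorter
uneven_parts = np.array_split(arr, 4)
print(uneven_parts)
```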


The return value of array_split() is a list containing each of the splits as an array.

If you split an array into 3 arrays, you can access them from the result just like any list element:

In [27]:
# Accessing the split arrays

arr = np.array([1, 2, 3, 4, 5, 6])

newarr = np.array_split(arr, 3)

print(newarr[0])
print(newarr[1])
print(newarr[2]) 

[1 2]
[3 4]
[5 6]


Just as hstack(), vstack() and dstack() join arrays, the counterparts hsplit(), vsplit() and dsplit() split them along the same axes.
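
As a small sketch of these split counterparts, assuming a 2 x 6 array chosen for illustration, hsplit() (the counterpart of hstack()) cuts between columns and vsplit() cuts between rows:

```python
import numpy as np

arr = np.array([[1, 2, 3, 4, 5, 6],
                [7, 8, 9, 10, 11, 12]])

# hsplit: three pieces, each keeping both rows but only two columns
left, middle, right = np.hsplit(arr, 3)
print(left)

# vsplit: two pieces, each keeping one full row
top, bottom = np.vsplit(arr, 2)
print(top)
```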

## Array Search

You can search an array for a certain value and get back the indexes where it matches.

To search an array, use the where() function.

In [29]:
arr = np.array([1, 2, 3, 4, 5, 4, 4])

x = np.where(arr == 4)

print(x) # a tuple of index arrays marking where the value was found

(array([3, 5, 6]),)


In [30]:
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])

x = np.where(arr%2 == 0) # indexes where values are even

print(x) 

(array([1, 3, 5, 7]),)
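
The result is a tuple because where() returns one index array per dimension. A minimal sketch on a 2-D array, chosen for illustration, makes this visible:

```python
import numpy as np

arr = np.array([[1, 4, 3],
                [4, 5, 4]])

# One array of row indexes and one array of column indexes
rows, cols = np.where(arr == 4)

print(rows)  # [0 1 1]
print(cols)  # [1 0 2]

# Pair them up to get the (row, column) position of each match
positions = list(zip(rows.tolist(), cols.tolist()))
print(positions)  # [(0, 1), (1, 0), (1, 2)]
```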


### Search Sorted

There is a function called searchsorted() which performs a binary search in a sorted array, and returns the index where the specified value would be inserted to maintain the sort order.

In [31]:
arr = np.array([6, 7, 8, 9])

x = np.searchsorted(arr, 7)

print(x) 

1


#### Search from the Right Side

By default the leftmost index is returned, but we can pass side='right' to return the rightmost index instead.

In [32]:
arr = np.array([6, 7, 8, 9])

x = np.searchsorted(arr, 7, side='right')

print(x) 

2


#### Multiple Values

In [33]:
arr = np.array([1, 3, 5, 7])

x = np.searchsorted(arr, [2, 4, 6])

print(x) 

[1 2 3]
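
One way to use the returned index is to actually perform the insertion. The sketch below pairs searchsorted() with np.insert(), which is not used elsewhere in this notebook and is brought in here as an assumption about how the index might be applied:

```python
import numpy as np

arr = np.array([1, 3, 5, 7])
value = 4

# Index at which value keeps the array sorted
idx = np.searchsorted(arr, value)
print(idx)  # 2

# np.insert returns a new array; arr itself is left unchanged
newarr = np.insert(arr, idx, value)
print(newarr)  # [1 3 4 5 7]
```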


## Array Sorting

Sorting means putting elements in an ordered sequence.

Ordered sequence is any sequence that has an order corresponding to elements, like numeric or alphabetical, ascending or descending.

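NumPy has a function called sort() that returns a sorted copy of the specified array, leaving the original unchanged. (The ndarray object also has a sort() method, which sorts the array in place.)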

In [34]:
arr = np.array([3, 2, 0, 1])

print(np.sort(arr)) 

[0 1 2 3]


In [35]:
arr = np.array(['banana', 'cherry', 'apple'])

print(np.sort(arr)) 

['apple' 'banana' 'cherry']


In [36]:
arr = np.array([True, False, True])

print(np.sort(arr)) 

[False  True  True]


In [37]:
arr = np.array([[3, 2, 4], [5, 0, 1]])

print(np.sort(arr)) 

[[2 3 4]
 [0 1 5]]
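
For a 2-D array, np.sort() sorts each row by default because it works along the last axis. A short sketch showing the axis argument, and the difference between np.sort() and the in-place sort() method of the array:

```python
import numpy as np

arr = np.array([[3, 2, 4], [5, 0, 1]])

by_row = np.sort(arr)          # default: sort along the last axis (within each row)
by_col = np.sort(arr, axis=0)  # sort down each column instead

print(by_row)
print(by_col)
print(arr)    # np.sort returned copies, so arr is unchanged so far

arr.sort()    # the method sorts arr itself, in place
print(arr)
```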


## Array Filter

Getting some elements out of an existing array and creating a new array out of them is called filtering.

In NumPy, you filter an array using a boolean index list.

A boolean index list is a list of booleans corresponding to indexes in the array.

In [None]:
arr = np.array([41, 42, 43, 44])

x = [True, False, True, False]

newarr = arr[x]

print(newarr) 

[41 43]


The example above returns [41, 43]. Why?

Because the new array contains only the values where the filter array had the value True, in this case indexes 0 and 2.

In [40]:
# creating an array that contains only values greater than 42
arr = np.array([41, 42, 43, 44])

# Create an empty list
filter_arr = []

# go through each element in arr
for element in arr:
  # if the element is higher than 42, set the value to True, otherwise False:
  if element > 42:
    filter_arr.append(True)
  else:
    filter_arr.append(False)

newarr = arr[filter_arr]

print(filter_arr)
print(newarr) 

[False, False, True, True]
[43 44]


In [41]:
# creating an array that contains only even elements from the original array

arr = np.array([1, 2, 3, 4, 5, 6, 7])

# Create an empty list
filter_arr = []

# go through each element in arr
for element in arr:
  # if the element is divisible by 2, set the value to True, otherwise False:
  if element % 2 == 0:
    filter_arr.append(True)
  else:
    filter_arr.append(False)

newarr = arr[filter_arr]

print(filter_arr)
print(newarr) 

[False, True, False, True, False, True, False]
[2 4 6]


### Creating Filter Directly From Array

The above example is quite a common task in NumPy, and NumPy provides a nice way to tackle it.

We can use the array itself in the condition, in place of the loop variable. NumPy then evaluates the condition for every element at once and produces a boolean array.

In [42]:
arr = np.array([41, 42, 43, 44])

filter_arr = arr > 42

newarr = arr[filter_arr]

print(filter_arr)
print(newarr) 

[False False  True  True]
[43 44]


In [43]:
arr = np.array([1, 2, 3, 4, 5, 6, 7])

filter_arr = arr % 2 == 0

newarr = arr[filter_arr]

print(filter_arr)
print(newarr) 

[False  True False  True False  True False]
[2 4 6]
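
Conditions written this way can also be combined. A minimal sketch, assuming the element-wise operators & (and), | (or) and ~ (not); each condition needs its own parentheses because these operators bind more tightly than the comparisons:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7])

even_and_large = arr[(arr % 2 == 0) & (arr > 2)]  # even AND greater than 2
at_the_ends = arr[(arr < 2) | (arr > 6)]          # less than 2 OR greater than 6
odd = arr[~(arr % 2 == 0)]                        # NOT even

print(even_and_large)  # [4 6]
print(at_the_ends)     # [1 7]
print(odd)             # [1 3 5 7]
```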
