## 8. NumPy Array Shape

### Shape of an Array

The shape of an array is the number of elements in each dimension.

### Get the Shape of an Array

NumPy arrays have an attribute called ```shape``` that returns a tuple. The value at each index of the tuple is the number of elements along the corresponding dimension.

In [1]:
import numpy as np

arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])

print(arr.shape) 

(2, 4)


In [3]:
arr = np.array([1, 2, 3, 4], ndmin=5)

print(arr)
print('shape of array :', arr.shape) 

[[[[[1 2 3 4]]]]]
shape of array : (1, 1, 1, 1, 4)


### What does the shape tuple represent?

The integer at each index gives the number of elements in the corresponding dimension.

In the second example above, the value at index 4 is 4, so the 5th dimension (index 4 + 1) has 4 elements.
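
As a small sketch of how the shape tuple relates to other array attributes (the variable name ```total``` is only illustrative), the length of the tuple matches ```ndim``` and the product of its entries matches ```size```:

```python
import numpy as np

arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])

# The length of the shape tuple equals the number of dimensions.
print(len(arr.shape), arr.ndim)   # 2 2

# The product of the shape entries equals the total number of elements.
total = int(np.prod(arr.shape))
print(total, arr.size)            # 8 8
```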

## 9. NumPy Array Reshaping

### Reshaping arrays

Reshaping means **changing the shape** of an array.

The shape of an array is the number of elements in each dimension.

By reshaping we can add or remove dimensions or change the number of elements in each dimension.

### Reshape From 1-D to 2-D

In [6]:
# Convert the following 1-D array with 12 elements into a 2-D array.

import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])

newarr = arr.reshape(4, 3)

print(newarr)
print(arr.shape) 
print(newarr.shape) 

[[ 1  2  3]
 [ 4  5  6]
 [ 7  8  9]
 [10 11 12]]
(12,)
(4, 3)


In [8]:
# Convert the following 1-D array with 12 elements into a 3-D array.

import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])

newarr = arr.reshape(2, 3, 2)

print(newarr)
print(newarr.shape)

[[[ 1  2]
  [ 3  4]
  [ 5  6]]

 [[ 7  8]
  [ 9 10]
  [11 12]]]
(2, 3, 2)


### Can We Reshape Into any Shape?

Yes, as long as the **total number of elements is the same in both shapes**.

We can reshape a 1-D array of 8 elements into a 2-D array with 2 rows of 4 elements, but not into a 2-D array with 3 rows of 3 elements, because that would require 3 x 3 = 9 elements.
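
A minimal sketch of both cases follows. The ```failed``` flag is just an illustrative helper; the point is that NumPy raises a ```ValueError``` when the element counts do not match:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])

# 2 * 4 == 8, so this reshape succeeds.
ok = arr.reshape(2, 4)
print(ok.shape)

# 3 * 3 == 9, but only 8 elements exist, so this raises ValueError.
try:
    arr.reshape(3, 3)
    failed = False
except ValueError as err:
    failed = True
    print("reshape failed:", err)
```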

### Returns Copy or View?

In [11]:
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])

print(arr.reshape(2, 4).base) # prints the original array, so reshape returned a view; a copy would print None

[1 2 3 4 5 6 7 8]
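
Another way to check this, sketched here with the illustrative name ```view```, is to ask NumPy whether the two arrays share memory and to confirm that a write through the reshaped array shows up in the original:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])
view = arr.reshape(2, 4)

# The reshaped array points back at the original data.
print(view.base is arr)              # True
print(np.shares_memory(arr, view))   # True

# Writing through the view changes the original array as well.
view[0, 0] = 99
print(arr[0])                        # 99
```

Note that ```reshape()``` can only return a view when the memory layout allows it; in other cases it returns a copy.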


### Unknown Dimension

You are allowed to have one "unknown" dimension, meaning that you do not have to specify an exact number for one of the dimensions in the ```reshape()``` method.

Pass ```-1``` as the value, and NumPy will calculate this number for you.


In [13]:
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])

newarr = arr.reshape(2, 2, -1)

print(newarr)
print(newarr.shape)

[[[1 2]
  [3 4]]

 [[5 6]
  [7 8]]]
(2, 2, 2)
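
The sketch below shows the inferred dimension in different positions, and what happens when no whole number fits. The names ```two_rows```, ```four_rows``` and ```failed``` are illustrative only:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])

# NumPy works out the missing dimension from the total element count.
two_rows = arr.reshape(-1, 4)    # 8 / 4 = 2 rows
four_rows = arr.reshape(4, -1)   # 8 / 4 = 2 columns
print(two_rows.shape, four_rows.shape)

# 8 is not divisible by 3, so no valid value exists for -1.
try:
    arr.reshape(3, -1)
    failed = False
except ValueError as err:
    failed = True
    print("reshape failed:", err)
```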


### Flattening the arrays

Flattening an array means converting a multidimensional array into a 1-D array.

We can use ```reshape(-1)``` to do this.

In [14]:
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

newarr = arr.reshape(-1)

print(newarr) 

[1 2 3 4 5 6]


**Note**: NumPy has many more functions for changing the shape of arrays, such as ```flatten``` and ```ravel```, and for rearranging elements, such as ```rot90```, ```flip```, ```fliplr``` and ```flipud```. These belong to the intermediate and advanced parts of NumPy.
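
As a brief, hedged illustration of two of these functions: ```flatten()``` always returns a copy, while ```ravel()``` returns a view when the memory layout allows it (as it does for the simple array below). The variable names are illustrative:

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

flat_copy = arr.flatten()   # always a new, independent array
flat_view = arr.ravel()     # a view here, because arr is contiguous

print(flat_copy, flat_view)
print(np.shares_memory(arr, flat_copy))   # False
print(np.shares_memory(arr, flat_view))   # True
```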

## 10. NumPy Array Iterating

### Iterating Arrays

Iterating means going through elements one by one.

Multi-dimensional arrays in NumPy can be iterated with a basic Python **for loop**.

If we iterate on a 1-D array it will go through each element one by one.

In [16]:
import numpy as np

arr = np.array([1, 2, 3])

for x in arr:
    print(x) 

1
2
3


In [17]:
# Iterate on each scalar element of the 2-D array:
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

for x in arr:
    for y in x:
        print(y) 

1
2
3
4
5
6


In [18]:
# Iterate on the elements of the following 3-D array:
import numpy as np

arr = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])

for x in arr:
    print(x) 

[[1 2 3]
 [4 5 6]]
[[ 7  8  9]
 [10 11 12]]


In [19]:
# Iterate down to the scalars:
import numpy as np

arr = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])

for x in arr:
    for y in x:
        for z in y:
            print(z) 

1
2
3
4
5
6
7
8
9
10
11
12


### Iterating Arrays Using nditer()

The function ```nditer()``` is a helper that supports everything from very basic to very advanced iteration. It solves some basic issues we face when iterating; let's go through it with examples.

#### Iterating on Each Scalar Element

With basic ```for``` loops, iterating through each scalar of an n-dimensional array requires n nested ```for``` loops, which is difficult to write for arrays with very high dimensionality.

In [21]:
import numpy as np

arr = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])

for x in np.nditer(arr):
    print(x) 

1
2
3
4
5
6
7
8


### Iterating Array With Different Data Types

We can use the ```op_dtypes``` argument and pass it the expected data type to change the data type of the elements while iterating.

NumPy does not change the data type of an element in place (where the element is stored in the array), so it needs some other space to perform the conversion. That extra space is called a buffer, and to enable it in ```nditer()``` we pass ```flags=['buffered']```.

In [22]:
import numpy as np

arr = np.array([1, 2, 3])

for x in np.nditer(arr, flags=['buffered'], op_dtypes=['S']):
    print(x) 

b'1'
b'2'
b'3'


### Iterating With Different Step Size

We can slice the array with a step and then iterate over the result. The example below visits every second element of each row.

In [23]:
import numpy as np

arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])

for x in np.nditer(arr[:, ::2]):
    print(x)

1
3
5
7


### Enumerated Iteration Using ndenumerate()

Enumeration means listing items one by one together with their sequence number.

Sometimes we need the index of each element while iterating; the ```ndenumerate()``` function can be used for those use cases.

In [24]:
import numpy as np

arr = np.array([1, 2, 3])

for idx, x in np.ndenumerate(arr):
    print(idx, x) 

(0,) 1
(1,) 2
(2,) 3


In [25]:
import numpy as np

arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])

for idx, x in np.ndenumerate(arr):
    print(idx, x) 

(0, 0) 1
(0, 1) 2
(0, 2) 3
(0, 3) 4
(1, 0) 5
(1, 1) 6
(1, 2) 7
(1, 3) 8


## 11. NumPy Joining Array

### Joining NumPy Arrays

Joining means putting contents of two or more arrays in a single array.

In SQL we join tables based on a key, whereas in NumPy we join arrays by axes.

We pass a sequence of arrays that we want to join to the ```concatenate()``` function, along with the axis. If axis is not explicitly passed, it is taken as 0.

In [26]:
# Join two arrays
import numpy as np

arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
arr = np.concatenate((arr1, arr2))

print(arr) 

[1 2 3 4 5 6]


In [27]:
# Join two 2-D arrays along axis 1 (side by side, so each row gets longer):
import numpy as np

arr1 = np.array([[1, 2], [3, 4]])
arr2 = np.array([[5, 6], [7, 8]])
arr = np.concatenate((arr1, arr2), axis=1)

print(arr) 

[[1 2 5 6]
 [3 4 7 8]]
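
To make the effect of the axis argument concrete, here is a small sketch that joins the same two arrays along both axes (the names ```down``` and ```across``` are illustrative):

```python
import numpy as np

arr1 = np.array([[1, 2], [3, 4]])
arr2 = np.array([[5, 6], [7, 8]])

# axis=0 (the default) appends arr2 below arr1: more rows.
down = np.concatenate((arr1, arr2), axis=0)

# axis=1 appends arr2 to the right of arr1: longer rows.
across = np.concatenate((arr1, arr2), axis=1)

print(down.shape)     # (4, 2)
print(across.shape)   # (2, 4)
```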


### Joining Arrays Using Stack Functions

Stacking is the same as concatenation, except that stacking is done along a new axis.

For example, two 1-D arrays of length 3 can be stacked along a new second axis, which produces a 3 x 2 array whose columns are the original arrays.

We pass a sequence of arrays that we want to join to the ```stack()``` function, along with the axis. If the axis is not explicitly passed, it is taken as 0.

In [28]:
import numpy as np

arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
arr = np.stack((arr1, arr2), axis=1)

print(arr) 

[[1 4]
 [2 5]
 [3 6]]


### Stacking Along Rows

NumPy provides a helper function, ```hstack()```, to stack arrays horizontally (side by side, so each row gets longer).

In [29]:
import numpy as np

arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
arr = np.hstack((arr1, arr2))

print(arr) 

[1 2 3 4 5 6]


### Stacking Along Columns

NumPy provides a helper function, ```vstack()```, to stack arrays vertically (one on top of another, so each column gets longer).

In [30]:
import numpy as np

arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
arr = np.vstack((arr1, arr2))

print(arr) 

[[1 2 3]
 [4 5 6]]


### Stacking Along Height (depth)

NumPy provides a helper function, ```dstack()```, to stack arrays along the third axis (depth).

In [31]:
import numpy as np

arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
arr = np.dstack((arr1, arr2))

print(arr) 

[[[1 4]
  [2 5]
  [3 6]]]
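
The differences between these helpers are easiest to see by comparing result shapes. The following sketch collects them in an illustrative dictionary named ```shapes```:

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# Result shape of each joining helper for two 1-D arrays of length 3.
shapes = {
    "stack axis=0": np.stack((a, b), axis=0).shape,   # new first axis
    "stack axis=1": np.stack((a, b), axis=1).shape,   # new second axis
    "hstack": np.hstack((a, b)).shape,                # stays 1-D
    "vstack": np.vstack((a, b)).shape,                # one row per input
    "dstack": np.dstack((a, b)).shape,                # new third axis
}

for name, shape in shapes.items():
    print(name, shape)
```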


## 12. NumPy Splitting Array

### Splitting NumPy Arrays

Splitting is the reverse operation of joining.

Joining merges multiple arrays into one, and splitting breaks one array into multiple.

We use ```array_split()``` to split arrays, passing it the array we want to split and the number of splits.

In [33]:
# Split the array in 3 parts:
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6])
newarr = np.array_split(arr, 3)

print(newarr) # The return value is a list containing three arrays.

[array([1, 2]), array([3, 4]), array([5, 6])]


In [35]:
# Split the array in 4 parts:
# If the array cannot be divided evenly, the first sub-arrays get one extra element and the later ones are shorter.
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6])
newarr = np.array_split(arr, 4)

print(newarr) 

[array([1, 2]), array([3, 4]), array([5]), array([6])]
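
NumPy also has a stricter ```split()``` function. As a hedged comparison (the names ```sizes``` and ```failed``` are illustrative), ```split()``` raises an error when the array cannot be divided into equal parts, whereas ```array_split()``` adjusts the sizes:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6])

# array_split() tolerates an uneven division.
parts = np.array_split(arr, 4)
sizes = [len(p) for p in parts]
print(sizes)

# split() insists on equal parts and raises ValueError otherwise.
try:
    np.split(arr, 4)
    failed = False
except ValueError as err:
    failed = True
    print("split failed:", err)

# With a divisor that fits, both functions give the same result.
equal = np.split(arr, 3)
print(equal)
```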


### Split Into Arrays

The return value of the ```array_split()``` function is a list containing each of the splits as an array.

If you split an array into 3 arrays, you can access them from the result just like any list element:

In [39]:
# Access the split arrays:
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6])

newarr = np.array_split(arr, 3)

print(newarr)
print()
print(newarr[0])
print(newarr[1])
print(newarr[2]) 

[array([1, 2]), array([3, 4]), array([5, 6])]

[1 2]
[3 4]
[5 6]


### Splitting 2-D Arrays

Use the same syntax when splitting 2-D arrays.

Call ```array_split()``` with the array you want to split and the number of splits you want.

In [40]:
# Split the 2-D array into three 2-D arrays.
import numpy as np

arr = np.array([[1, 2], [3, 4], [5, 6], [7, 8], [9, 10], [11, 12]])
newarr = np.array_split(arr, 3)

print(newarr) 

[array([[1, 2],
       [3, 4]]), array([[5, 6],
       [7, 8]]), array([[ 9, 10],
       [11, 12]])]


The example above returns three 2-D arrays.

In addition, you can specify which axis you want to split along.

The example below also returns three 2-D arrays, but they are split along axis 1, so each result holds one column of the original.

In [42]:
# Split the 2-D array into three 2-D arrays along axis 1 (column by column).
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12], [13, 14, 15], [16, 17, 18]])

newarr = np.array_split(arr, 3, axis=1)

print(newarr) 

[array([[ 1],
       [ 4],
       [ 7],
       [10],
       [13],
       [16]]), array([[ 2],
       [ 5],
       [ 8],
       [11],
       [14],
       [17]]), array([[ 3],
       [ 6],
       [ 9],
       [12],
       [15],
       [18]])]


An alternative is ```hsplit()```, the opposite of ```hstack()```.

**Note**: The counterparts of ```vstack()``` and ```dstack()``` are available as ```vsplit()``` and ```dsplit()```.
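
A short sketch of these helpers on a small 2 x 3 array follows; the names ```cols```, ```rows``` and ```same``` are illustrative. Here ```hsplit()``` gives the same pieces as ```array_split()``` with ```axis=1```:

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

cols = np.hsplit(arr, 3)                  # three arrays of shape (2, 1)
rows = np.vsplit(arr, 2)                  # two arrays of shape (1, 3)
same = np.array_split(arr, 3, axis=1)     # equivalent to hsplit here

print(cols[0].shape, rows[0].shape)
print(all(np.array_equal(c, s) for c, s in zip(cols, same)))
```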

## 13. NumPy Searching Arrays

You can search an array for a certain value and return the indexes where it matches.

To search an array, use the ```where()``` function.

In [3]:
# Find the indexes where the value is 4:
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 4, 4])

x = np.where(arr == 4)

print(x) # a tuple of index arrays: the value 4 is present at indexes 3, 5, and 6

(array([3, 5, 6]),)


In [4]:
# Find the indexes where the values are even:
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])

x = np.where(arr%2 == 0)

print(x) 

(array([1, 3, 5, 7]),)
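
The tuple returned by ```where()``` holds one index array per dimension, and it can be used directly to index the original array. A minimal sketch, with illustrative names ```idx```, ```grid```, ```rows``` and ```cols```:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])

idx = np.where(arr % 2 == 0)
print(idx[0])     # the index array itself: [1 3 5 7]
print(arr[idx])   # the matching values:    [2 4 6 8]

# For a 2-D array the tuple has one index array per dimension.
grid = np.array([[1, 4], [4, 2]])
rows, cols = np.where(grid == 4)
print(rows, cols)   # [0 1] [1 0]
```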


### Search Sorted

The ```searchsorted()``` function performs a **binary search** in the array and returns the index where the specified value would be inserted to maintain the sort order.

The ```searchsorted()``` function assumes that it is used on **sorted arrays**; on unsorted input the result is not meaningful.


In [10]:
# Find the index where the value 7 should be inserted:
import numpy as np

arr = np.array([6, 7, 8, 9])

x = np.searchsorted(arr, 7)

print(x) # The number 7 should be inserted at index 1 to maintain the sort order.

1


#### Search From the Right Side

By default the leftmost index is returned, but we can pass ```side='right'``` to return the rightmost index instead.

In [11]:
# Find the index where the value 7 should be inserted, starting from the right:
import numpy as np

arr = np.array([6, 7, 8, 9])

x = np.searchsorted(arr, 7, side='right')

print(x) 

2
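
The difference between the two sides is clearer when the value occurs more than once. In this sketch (the names ```left```, ```right``` and ```count``` are illustrative), the gap between the two indexes is the number of occurrences:

```python
import numpy as np

arr = np.array([6, 7, 7, 7, 8, 9])

left = np.searchsorted(arr, 7)                  # before the first 7
right = np.searchsorted(arr, 7, side='right')   # after the last 7
count = right - left                            # how many 7s there are

print(left, right, count)   # 1 4 3
```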


#### Multiple Values

To search for more than one value, use an array with the specified values.

In [13]:
import numpy as np

arr = np.array([1, 3, 5, 7])

x = np.searchsorted(arr, [2, 4, 6])

print(x) # An array of the indexes where 2, 4, and 6 would be inserted to maintain the order.

[1 2 3]


## 14. NumPy Sorting Arrays

### Sorting Arrays

Sorting means putting elements in an **ordered sequence**.

An ordered sequence is any sequence that has an order corresponding to its elements, such as numeric or alphabetical, ascending or descending.

NumPy has a function called ```sort()``` that returns a sorted copy of a specified array.

In [17]:
import numpy as np

arr = np.array([3, 2, 0, 1])

print(np.sort(arr))
print(np.sort(arr).base) 

[0 1 2 3]
None


**Note**: The ```np.sort()``` function returns a copy of the array, leaving the original array unchanged.
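
By contrast, the array's own ```sort()``` method sorts in place and returns ```None```. The sketch below shows both, plus one common way to get descending order by reversing the sorted copy (the variable names are illustrative):

```python
import numpy as np

arr = np.array([3, 2, 0, 1])

sorted_copy = np.sort(arr)        # new array; arr is untouched
print(arr)                        # [3 2 0 1]

descending = np.sort(arr)[::-1]   # reverse the sorted copy
print(descending)                 # [3 2 1 0]

result = arr.sort()               # sorts arr itself and returns None
print(arr, result)                # [0 1 2 3] None
```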

You can also sort arrays of strings, or any other data type:

In [18]:
# Sort the array alphabetically:
import numpy as np

arr = np.array(['banana', 'cherry', 'apple'])

print(np.sort(arr))

['apple' 'banana' 'cherry']


In [19]:
# Sort a boolean array:
import numpy as np

arr = np.array([True, False, True])

print(np.sort(arr)) 

[False  True  True]


### Sorting a 2-D Array

If you use the ```sort()``` function on a 2-D array, each inner array (row) is sorted separately:

In [20]:
# Sort a 2-D array:
import numpy as np

arr = np.array([[3, 2, 4], [5, 0, 1]])

print(np.sort(arr))

[[2 3 4]
 [0 1 5]]
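
Rows are sorted because ```np.sort()``` works along the last axis by default. A small sketch of the ```axis``` parameter, using illustrative names for the results:

```python
import numpy as np

arr = np.array([[3, 2, 4], [5, 0, 1]])

by_row = np.sort(arr, axis=1)       # default behaviour: sort each row
by_col = np.sort(arr, axis=0)       # sort each column instead
flat = np.sort(arr, axis=None)      # flatten first, then sort everything

print(by_row)
print(by_col)
print(flat)
```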


## 15. NumPy Filter Array

### Filtering Arrays

**Getting some elements out** of an existing array and **creating a new array** out of them is called filtering.

In NumPy, you filter an array using a *boolean index list*.

A boolean index list is a list of booleans corresponding to indexes in the array.

If the value at an index is ```True```, that element is included in the filtered array; if the value at that index is ```False```, that element is excluded.

In [22]:
# Create an array from the elements on index 0 and 2:
import numpy as np

arr = np.array([41, 42, 43, 44])

x = [True, False, True, False]

newarr = arr[x]

print(newarr) # the new array contains only the values where the filter had the value True

[41 43]


### Creating the Filter Array

In the example above we hard-coded the ```True``` and ```False``` values, but the common use is to create a filter array based on conditions.

In [23]:
# Create a filter array that will return only values higher than 42:
import numpy as np

arr = np.array([41, 42, 43, 44])

# Create an empty list
filter_arr = []

# go through each element in arr
for element in arr:
    # if the element is higher than 42, append True, otherwise False:
    if element > 42:
        filter_arr.append(True)
    else:
        filter_arr.append(False)

newarr = arr[filter_arr]

print(filter_arr)
print(newarr) 

[False, False, True, True]
[43 44]


In [24]:
# Create a filter array that will return only even elements from the original array:
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7])

# Create an empty list
filter_arr = []

# go through each element in arr
for element in arr:
    # if the element is evenly divisible by 2, append True, otherwise False:
    if element % 2 == 0:
        filter_arr.append(True)
    else:
        filter_arr.append(False)

newarr = arr[filter_arr]

print(filter_arr)
print(newarr) 

[False, True, False, True, False, True, False]
[2 4 6]


### Creating Filter Directly From Array

The task above is so common in NumPy that the library provides a more direct way to do it.

We can use the array itself in the condition, in place of the loop variable, and the comparison is applied to every element at once.

In [25]:
# Create a filter array that will return only values higher than 42:
import numpy as np

arr = np.array([41, 42, 43, 44])

filter_arr = arr > 42

newarr = arr[filter_arr]

print(filter_arr)
print(newarr) 

[False False  True  True]
[43 44]


In [26]:
# Create a filter array that will return only even elements from the original array:
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7])

filter_arr = arr % 2 == 0

newarr = arr[filter_arr]

print(filter_arr)
print(newarr) 

[False  True False  True False  True False]
[2 4 6]
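
Conditions can also be combined. The following sketch uses the element-wise operators ```&``` (and), ```|``` (or) and ```~``` (not); each condition needs its own parentheses because these operators bind more tightly than comparisons. The result names are illustrative:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7])

both = arr[(arr > 2) & (arr % 2 == 0)]   # greater than 2 AND even
either = arr[(arr < 2) | (arr > 6)]      # less than 2 OR greater than 6
negated = arr[~(arr % 2 == 0)]           # NOT even, i.e. odd

print(both)      # [4 6]
print(either)    # [1 7]
print(negated)   # [1 3 5 7]
```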
