# Chapter-10 - Joining of multiple arrays into a single array

In [3]:
from IPython.display import Image

**Joining of multiple arrays into a single array**

It is loosely similar to join queries in Oracle: data from several sources is combined into a single result.<br>

We can join/concatenate multiple ndarrays into a single array by using the following functions.<br>

1. concatenate()<br>
2. stack()<br>
3. vstack()<br>
4. hstack()<br>
5. dstack()<br>

## concatenate()

In [None]:
import numpy as np
help(np.concatenate)

**Syntax:** concatenate(...)<br>

• concatenate((a1, a2, ...), axis=0, out=None, dtype=None, casting="same_kind")<br>
• Join a sequence of arrays along an existing axis.<br>

**(a1, a2, ...) ==> sequence of input arrays to be joined**<br>
**axis ==> the axis along which concatenation is performed**<br>
&emsp; • axis=0 (default) :: vertical concatenation<br>
&emsp; • axis=1 :: horizontal concatenation<br>
&emsp; • axis=None :: the arrays are first flattened (converted to 1-D arrays) and then concatenated<br>
**out ==> destination array in which the concatenation result is stored**<br>

### Use of axis parameter

<img src="concatinate.JPG" alt="concatinate" width="700" height="700">

In [1]:
# 1-D array
import numpy as np
a = np.arange(4)
b = np.arange(5)
c = np.arange(3)
np.concatenate((a,b,c))

array([0, 1, 2, 3, 0, 1, 2, 3, 4, 0, 1, 2])

In [6]:
# 2-D array
import numpy as np
a = np.array([[1,2],[3,4]])
b = np.array([[5,6],[7,8]])

# concatenation by providing axis parameter
# Vertical Concatenation
vcon = np.concatenate((a,b))
vcon1 = np.concatenate((a,b),axis=0)

# Horizontal Concatenation
hcon = np.concatenate((a,b),axis=1)

# flatten and then concatenation
flatt = np.concatenate((a,b),axis=None)
print(f"array a ==> \n {a}")
print(f"array b ==> \n {b}")
print(f"Without specifying axis parameter ==> \n {vcon}")
print(f"Specifying axis=0 ==> \n {vcon1}")
print(f"Specifying axis=1 ==> \n {hcon}")
print(f"Specifying axis=None ==> \n {flatt}")

array a ==> 
 [[1 2]
 [3 4]]
array b ==> 
 [[5 6]
 [7 8]]
Without specifying axis parameter ==> 
 [[1 2]
 [3 4]
 [5 6]
 [7 8]]
Specifying axis=0 ==> 
 [[1 2]
 [3 4]
 [5 6]
 [7 8]]
Specifying axis=1 ==> 
 [[1 2 5 6]
 [3 4 7 8]]
Specifying axis=None ==> 
 [1 2 3 4 5 6 7 8]


**Rules**:<br>

• We can join any number of arrays, but all arrays must have the same number of dimensions.<br>
• The sizes of all axes except the concatenation axis must be the same.<br>
• The shape of the out array (if given) must match the shape of the result.<br>
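
The first two rules can be checked from the shapes alone. The following is a minimal sketch, assuming a hypothetical helper named `can_concatenate` (it is not part of NumPy) and non-negative axis values only:

```python
import numpy as np

def can_concatenate(shapes, axis=0):
    """Hypothetical helper (not a NumPy function): check the rules above."""
    ndims = {len(s) for s in shapes}
    if len(ndims) != 1:              # all arrays need the same number of dimensions
        return False
    ndim = ndims.pop()
    if not 0 <= axis < ndim:         # the concatenation axis must already exist
        return False
    # every axis except the concatenation axis must have the same size
    others = {tuple(n for i, n in enumerate(s) if i != axis) for s in shapes}
    return len(others) == 1

a = np.arange(6).reshape(2, 3)
b = np.arange(15).reshape(5, 3)

ok_axis0 = can_concatenate([a.shape, b.shape], axis=0)
ok_axis1 = can_concatenate([a.shape, b.shape], axis=1)
print(ok_axis0)  # True  ==> column counts match (3 and 3)
print(ok_axis1)  # False ==> row counts differ (2 and 5)
```

Such a check is only useful for reporting a friendlier message; np.concatenate() itself raises ValueError when the shapes do not fit.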

<img src="concatinate_1.JPG" alt="concatinate_1" width="700" height="700">

In [9]:
# Rule-2 demonstration
import numpy as np
a = np.arange(6).reshape(2,3)
b = np.arange(15).reshape(5,3)
print(f"array a ==> \n {a}")
print(f"array b ==> \n {b}")

# axis=0 ==> Vertical concatenation
vcon = np.concatenate((a,b),axis=0)
print(f"Vertical Concatenation array ==> \n{vcon}")

# axis=1 ==> Horizontal Concatenation
hcon = np.concatenate((a,b),axis=1)
print(f"Horizontal Concatenation array ==> \n{hcon}")

array a ==> 
 [[0 1 2]
 [3 4 5]]
array b ==> 
 [[ 0  1  2]
 [ 3  4  5]
 [ 6  7  8]
 [ 9 10 11]
 [12 13 14]]
Vertical Concatenation array ==> 
[[ 0  1  2]
 [ 3  4  5]
 [ 0  1  2]
 [ 3  4  5]
 [ 6  7  8]
 [ 9 10 11]
 [12 13 14]]


ValueError: all the input array dimensions except for the concatenation axis must match exactly, but along dimension 0, the array at index 0 has size 2 and the array at index 1 has size 5

### Storing the result using 'out' parameter

• concatenate(...)<br>
• concatenate((a1, a2, ...), axis=0, out=None, dtype=None, casting="same_kind")<br>
• Join a sequence of arrays along an existing axis.<br>
• we can store the result in an existing array using the 'out' parameter, but out must have the same shape as the result<br>

In [12]:
# example for out parameter
import numpy as np
a = np.arange(4)
b = np.arange(5)
c = np.empty(9) # default dtype for empty is float
np.concatenate((a,b),out=c)
c

array([0., 1., 2., 3., 0., 1., 2., 3., 4.])

In [13]:
# if the shape of result and out differs then we will get error : ValueError
import numpy as np
a = np.arange(4)
b = np.arange(5)
c = np.empty(10) # default dtype is float
np.concatenate((a,b),out=c)

ValueError: Output array is the wrong shape

### using 'dtype' parameter
• we can specify the required data type of the result using the dtype parameter

**Note:**<br>
• We can use either out or dtype.<br>
• We cannot use both out and dtype simultaneously because out has its own data type
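
As a small sketch of the two alternatives (the variable names `as_float` and `dest` are only illustrative): either ask concatenate() for a dtype, or preallocate an out array whose own dtype decides the type of the stored result.

```python
import numpy as np

a = np.arange(4)
b = np.arange(5)

# Option 1: let concatenate() allocate the result with the requested dtype
as_float = np.concatenate((a, b), dtype=np.float64)

# Option 2: preallocate the destination; its dtype is used for the result
dest = np.empty(9, dtype=np.float32)
np.concatenate((a, b), out=dest)

print(as_float.dtype)  # float64
print(dest.dtype)      # float32
```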

In [14]:
# Demo for dtype parameter
import numpy as np
a = np.arange(4)
b = np.arange(5)
np.concatenate((a,b),dtype=str)

array(['0', '1', '2', '3', '0', '1', '2', '3', '4'], dtype='<U11')

In [15]:
# Demo for passing both out and dtype parameters

import numpy as np
a = np.arange(4)
b = np.arange(5)
c = np.empty(9,dtype=str)
np.concatenate((a,b),out=c,dtype=str)

TypeError: concatenate() only takes `out` or `dtype` as an argument, but both were provided.

### Concatenation of 1-D arrays

• we can concatenate any number of 1-D arrays at a time<br>
• For 1-D arrays there exists only one axis i.e., axis-0
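
A short sketch of what "only one axis" means in practice: asking for axis=1 on 1-D inputs fails. The `failed` flag is only there for illustration.

```python
import numpy as np

a = np.arange(4)
b = np.arange(5)

joined = np.concatenate((a, b), axis=0)
print(joined.shape)  # (9,)

failed = False
try:
    np.concatenate((a, b), axis=1)   # a 1-D array has no axis-1
except ValueError as exc:            # NumPy's AxisError is a subclass of ValueError
    failed = True
    print(f"axis=1 failed: {exc}")
```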

In [16]:
# Demo for concatenation of three 1-D arrays
a = np.arange(4)
b = np.arange(5)
c = np.arange(3)
np.concatenate((a,b,c),dtype=int)

array([0, 1, 2, 3, 0, 1, 2, 3, 4, 0, 1, 2])

### Concatenation of 2-D arrays

• we can concatenate any number of 2-D arrays at a time<br>
• For 2-D arrays there exists two axes i.e., axis-0 and axis-1<br>

  - axis-0 ==> represents number of rows<br>
  - axis-1 ==> represents number of columns<br>

• we can perform concatenation either axis-0 or axis-1<br>
• size of all dimensions(axes) must be matched except concatenation axis.

<img src="concatinate_2.JPG" alt="concatinate_2" width="700" height="700">

In [18]:
# Demo for concatenation of two 2-D arrays
import numpy as np
a = np.array([[10,20],[30,40],[50,60]])
b = np.array([[70,80],[90,100]])
print(f"a array :: \n {a}")
print(f"b array :: \n {b}")
print(f"a shape : {a.shape}")
print(f"b shape : {b.shape}")
# concatenation on axis=0 ==> Vertical concatenation
vcon = np.concatenate((a,b),axis=0)
print(f"Concatenation based on axis-0(vertical) :: \n {vcon}")
# concatenation on axis=1 ==> Horizontal concatenation (fails here: row counts differ, 3 vs 2)
hcon = np.concatenate((a,b),axis=1)
print(f"Concatenation based on axis-1(horizontal) :: \n {hcon}")

a array :: 
 [[10 20]
 [30 40]
 [50 60]]
b array :: 
 [[ 70  80]
 [ 90 100]]
a shape : (3, 2)
b shape : (2, 2)
Concatenation based on axis-0(vertical) :: 
 [[ 10  20]
 [ 30  40]
 [ 50  60]
 [ 70  80]
 [ 90 100]]


ValueError: all the input array dimensions except for the concatenation axis must match exactly, but along dimension 0, the array at index 0 has size 3 and the array at index 1 has size 2

### Concatenation of 3-D arrays

• we can concatenate any number of 3-D arrays at a time<br>
• For 3-D arrays there exists three axes i.e., axis-0,axis-1 and axis-2<br>
  - axis-0 ==> represents number of 2-D arrays<br>
  - axis-1 ==> represents number of rows in every 2-D array<br>
  - axis-2 ==> represents number of columns in every 2-D array<br>
  
• we can perform concatenation on axis-0, axis-1, axis-2 (existing axis)<br>
• size of all dimensions(axes) must be matched except concatenation axis.

<img src="concatinate_3.JPG" alt="concatinate_3" width="700" height="700">

In [20]:
# Demo for concatenation of 3-D arrays
import numpy as np
a = np.arange(12).reshape(2,3,2)
b = np.arange(18).reshape(2,3,3)
print(f"array a : \n {a}")
print(f"array b : \n {b}")
# concatenation along axis=0 ==> not possible in this case
np.concatenate((a,b),axis=0)

array a : 
 [[[ 0  1]
  [ 2  3]
  [ 4  5]]

 [[ 6  7]
  [ 8  9]
  [10 11]]]
array b : 
 [[[ 0  1  2]
  [ 3  4  5]
  [ 6  7  8]]

 [[ 9 10 11]
  [12 13 14]
  [15 16 17]]]


ValueError: all the input array dimensions except for the concatenation axis must match exactly, but along dimension 2, the array at index 0 has size 2 and the array at index 1 has size 3

In [21]:
# concatenation along axis=1 ==> not possible in this case
np.concatenate((a,b),axis=1)

ValueError: all the input array dimensions except for the concatenation axis must match exactly, but along dimension 2, the array at index 0 has size 2 and the array at index 1 has size 3

In [22]:
# concatenation along axis=2 ==> possible in this case
np.concatenate((a,b),axis=2)

array([[[ 0,  1,  0,  1,  2],
        [ 2,  3,  3,  4,  5],
        [ 4,  5,  6,  7,  8]],

       [[ 6,  7,  9, 10, 11],
        [ 8,  9, 12, 13, 14],
        [10, 11, 15, 16, 17]]])

<img src="concatinate_4.JPG" alt="concatinate_4" width="700" height="700">

In [24]:
# Demo for concatenation of 3-D array in all axes
import numpy as np
a = np.arange(18).reshape(2,3,3)
b = np.arange(18,36).reshape(2,3,3)
print(f"array a : \n {a}")
print(f"array b : \n {b}")
print(f"array a shape: {a.shape}")
print(f"array b shape: {b.shape}")

array a : 
 [[[ 0  1  2]
  [ 3  4  5]
  [ 6  7  8]]

 [[ 9 10 11]
  [12 13 14]
  [15 16 17]]]
array b : 
 [[[18 19 20]
  [21 22 23]
  [24 25 26]]

 [[27 28 29]
  [30 31 32]
  [33 34 35]]]
array a shape: (2, 3, 3)
array b shape: (2, 3, 3)


In [25]:
# concatenation along axis-0
axis0_result = np.concatenate((a,b),axis=0)
print(f"Concatenation along axis-0 : \n {axis0_result}")
print(f"Shape of the resultant array : {axis0_result.shape}")

Concatenation along axis-0 : 
 [[[ 0  1  2]
  [ 3  4  5]
  [ 6  7  8]]

 [[ 9 10 11]
  [12 13 14]
  [15 16 17]]

 [[18 19 20]
  [21 22 23]
  [24 25 26]]

 [[27 28 29]
  [30 31 32]
  [33 34 35]]]
Shape of the resultant array : (4, 3, 3)


In [26]:
# concatenation along axis-1
axis1_result = np.concatenate((a,b),axis=1)
print(f"Concatenation along axis-1 : \n {axis1_result}")
print(f"Shape of the resultant array : {axis1_result.shape}")

Concatenation along axis-1 : 
 [[[ 0  1  2]
  [ 3  4  5]
  [ 6  7  8]
  [18 19 20]
  [21 22 23]
  [24 25 26]]

 [[ 9 10 11]
  [12 13 14]
  [15 16 17]
  [27 28 29]
  [30 31 32]
  [33 34 35]]]
Shape of the resultant array : (2, 6, 3)


In [27]:
# concatenation along axis-2
axis2_result = np.concatenate((a,b),axis=2)
print(f"Concatenation along axis-2 : \n {axis2_result}")
print(f"Shape of the resultant array : {axis2_result.shape}")

Concatenation along axis-2 : 
 [[[ 0  1  2 18 19 20]
  [ 3  4  5 21 22 23]
  [ 6  7  8 24 25 26]]

 [[ 9 10 11 27 28 29]
  [12 13 14 30 31 32]
  [15 16 17 33 34 35]]]
Shape of the resultant array : (2, 3, 6)


### Array Concatenation Rules

- **Dimension Consistency**: When concatenating, the result will have the same number of dimensions as the input arrays.
  - `1-D + 1-D = 1-D`
  - `2-D + 2-D = 2-D`
  - `3-D + 3-D = 3-D`

- **Axis-Based Concatenation**: Concatenation occurs along an existing axis.
  - All input arrays must have the same number of dimensions.

#### Effects of Concatenation on 3-D Arrays
- **Axis-0 Concatenation**:
  - Increases the number of 2-D arrays.
  - Rows and columns remain unchanged.
- **Axis-1 Concatenation**:
  - Increases the number of rows.
  - 2-D arrays and columns remain unchanged.
- **Axis-2 Concatenation**:
  - Increases the number of columns.
  - 2-D arrays and rows remain unchanged.

#### Special Note on Incompatible Shapes
- **Example Query**: Is it possible to concatenate arrays with shapes `(3,2,3)` and `(2,1,3)`?
  - **Answer**: Not on any axis. Whichever axis is chosen, at least one of the remaining axes differs in size (3 vs 2 on axis-0, 2 vs 1 on axis-1).
  - However, concatenation is possible with `axis=None`. In this case, both arrays are flattened to 1-D first and then concatenated.
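
The answer can be verified with a short sketch. The arrays here are filled with zeros and ones only to get the two shapes from the query; the values do not matter.

```python
import numpy as np

a = np.zeros((3, 2, 3))
b = np.ones((2, 1, 3))

# Try every existing axis and record the ones that fail
failed_axes = []
for axis in range(a.ndim):
    try:
        np.concatenate((a, b), axis=axis)
    except ValueError:
        failed_axes.append(axis)

print(failed_axes)  # [0, 1, 2]

# axis=None flattens both arrays first: 18 + 6 = 24 elements
flat = np.concatenate((a, b), axis=None)
print(flat.shape)   # (24,)
```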

## stack()

- **Shape Consistency**: All input arrays must have the same shape for stacking.
- **Dimension Increase**: The resultant stacked array will have one more dimension than the input arrays.
- **Axis of Joining**: The joining of arrays is always based on a new axis created in the resultant array.

### Examples of Dimension Changes
- `1-D + 1-D = 2-D`: Stacking two 1-D arrays results in a 2-D array.
- `2-D + 2-D = 3-D`: Stacking two 2-D arrays results in a 3-D array.
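
A minimal sketch of the dimension increase. It also shows one way to think about stack(): give every input a new length-1 axis and then concatenate along it. The name `manual` is only illustrative.

```python
import numpy as np

a = np.array([10, 20, 30])
b = np.array([40, 50, 60])

stacked = np.stack((a, b))             # new axis-0 of length 2
print(a.ndim, stacked.ndim)            # 1 2
print(stacked.shape)                   # (2, 3)

# Same result: add a length-1 axis to each input, then concatenate along it
manual = np.concatenate((a[np.newaxis, :], b[np.newaxis, :]), axis=0)
print(np.array_equal(stacked, manual))  # True

m = np.zeros((2, 3))
stacked_2d = np.stack((m, m))          # 2-D + 2-D = 3-D
print(stacked_2d.shape)                # (2, 2, 3)
```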

In [None]:
# help on stack
import numpy as np
help(np.stack)

### Stacking 1-D array

• To use stack(), all input arrays must have the same shape; otherwise we will get an error.

In [29]:
import numpy as np
a = np.array([10,20,30])
b = np.array([40,50,60,70])
np.stack((a,b))

ValueError: all input arrays must have the same shape

**Stacking between 1-D arrays**<br>
• The resultant array will have one more dimension, i.e., it is a 2-D array<br>
• newly created array is 2-D and it has two axes ==> axis-0 and axis-1<br>
• so we can perform stacking on axis-0 and axis-1<br>

**Stacking along axis=0 in 1-D array**<br>
• axis-0 means stack elements of input array row wise<br>
• Read row wise from input arrays and arrange row wise in result array.

In [30]:
# stacking using axis=0
a = np.array([10,20,30])
b = np.array([40,50,60])
resultant_array = np.stack((a,b)) # default axis=0
print(f"Resultant array : \n {resultant_array}")
print(f"Resultant array shape: {resultant_array.shape}")

Resultant array : 
 [[10 20 30]
 [40 50 60]]
Resultant array shape: (2, 3)


**Stacking along axis=1 in 1-D array**<br>
• axis-1 means stack elements of input array column wise<br>
• Read row wise from input arrays and arrange column wise in result array.

In [31]:
# stacking using axis=1
a = np.array([10,20,30])
b = np.array([40,50,60])
resultant_array=np.stack((a,b),axis=1)
print(f"Resultant array : \n {resultant_array}")
print(f"Resultant array shape: {resultant_array.shape}")

Resultant array : 
 [[10 40]
 [20 50]
 [30 60]]
Resultant array shape: (3, 2)


### Stacking 2-D array

**The resultant array will be: 3-D array**<br>
3-D array shape:(x,y,z)<br>
- x==>axis-0 ====>The number of 2-D arrays<br>
- y==>axis-1 ====>The number of rows in every 2-D array<br>
- z==>axis-2 ====> The number of columns in every 2-D array<br>

axis-0 means 2-D arrays one by one<br>
axis-1 means row wise in each 2-D array<br>
axis-2 means column wise in each 2-D array<br>

<img src="stacking1.JPG" alt="stacking1" width="500" height="600">

In [33]:
# stacking of 2-D arrays ==> axis=0
# axis-0 means 2-D arrays one by one:
import numpy as np
a = np.array([[1,2,3],[4,5,6]])
b = np.array([[7,8,9],[10,11,12]])
np.stack((a,b)) # by default np.stack((a,b),axis=0)

array([[[ 1,  2,  3],
        [ 4,  5,  6]],

       [[ 7,  8,  9],
        [10, 11, 12]]])

<img src="stacking2.JPG" alt="stacking2" width="500" height="600">

In [36]:
# stacking of 2-D arrays ==> axis=1
# axis-1 means row wise in each 2-D array
a = np.array([[1,2,3],[4,5,6]])
b = np.array([[7,8,9],[10,11,12]])
np.stack((a,b),axis=1)

array([[[ 1,  2,  3],
        [ 7,  8,  9]],

       [[ 4,  5,  6],
        [10, 11, 12]]])

<img src="stacking3.JPG" alt="stacking3" width="600" height="600">

In [38]:
# stacking of 2-D arrays ==> axis=2
# axis-2 means column wise in each 2-D array
a = np.array([[1,2,3],[4,5,6]])
b = np.array([[7,8,9],[10,11,12]])
np.stack((a,b),axis=2)

array([[[ 1,  7],
        [ 2,  8],
        [ 3,  9]],

       [[ 4, 10],
        [ 5, 11],
        [ 6, 12]]])

In [39]:
# Demo of Stacking of Three 2-D arrays

a = np.arange(1,7).reshape(3,2)
b = np.arange(7,13).reshape(3,2)
c = np.arange(13,19).reshape(3,2)
print(f"array a :\n {a}")
print(f"array b :\n {b}")
print(f"array c :\n {c}")

# stacking along axis-0
# In 3-D array axis-0 means the number of 2-d arrays
axis0_stack = np.stack((a,b,c),axis=0)
print(f"Stacking three 2-D arrays along axis-0:\n {axis0_stack}")

array a :
 [[1 2]
 [3 4]
 [5 6]]
array b :
 [[ 7  8]
 [ 9 10]
 [11 12]]
array c :
 [[13 14]
 [15 16]
 [17 18]]
Stacking three 2-D arrays along axis-0:
 [[[ 1  2]
  [ 3  4]
  [ 5  6]]

 [[ 7  8]
  [ 9 10]
  [11 12]]

 [[13 14]
  [15 16]
  [17 18]]]


In [40]:
# stacking along axis-1
# In 3-D array, axis-1 means the number of rows.
# Stacking row wise
axis1_stack = np.stack((a,b,c),axis=1)
print(f"Stacking three 2-D arrays along axis-1:\n {axis1_stack}")

Stacking three 2-D arrays along axis-1:
 [[[ 1  2]
  [ 7  8]
  [13 14]]

 [[ 3  4]
  [ 9 10]
  [15 16]]

 [[ 5  6]
  [11 12]
  [17 18]]]


In [41]:
# stacking along axis-2
# in 3-D array axis-2 means the number of columns in every 2-D array.
# stacking column wise
axis2_stack = np.stack((a,b,c),axis=2)
print(f"Stacking three 2-D arrays along axis-2:\n {axis2_stack}")

Stacking three 2-D arrays along axis-2:
 [[[ 1  7 13]
  [ 2  8 14]]

 [[ 3  9 15]
  [ 4 10 16]]

 [[ 5 11 17]
  [ 6 12 18]]]


**Note:**<br>
• The input arrays are read row-wise<br>
• Arranging is based on the newly created axis of the result array

### Stacking Three 1-D array

In [42]:
a = np.arange(4)
b = np.arange(4,8)
c = np.arange(8,12)
print(f"array a :{a}")
print(f"array b :{b}")
print(f"array c :{c}")

# We will get 2-D array
# In 2-D array available axes are: axis-0 and axis-1
# Based on axis-0:
# axis-0 in 2-D array means the number of rows
axis0_stack = np.stack((a,b,c),axis=0)
print(f"Stacking three 1-D arrays along axis-0:\n {axis0_stack}")

# Based on axis-1:
# axis-1 in 2-D array means the number of columns
axis1_stack = np.stack((a,b,c),axis=1)
print(f"Stacking three 1-D arrays along axis-1:\n {axis1_stack}")

array a :[0 1 2 3]
array b :[4 5 6 7]
array c :[ 8  9 10 11]
Stacking three 1-D arrays along axis-0:
 [[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]
Stacking three 1-D arrays along axis-1:
 [[ 0  4  8]
 [ 1  5  9]
 [ 2  6 10]
 [ 3  7 11]]


## concatenate() Vs stack()

<img src="concat vs stack.JPG" alt="concat vs stack" width="800" height="800">
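
The contrast described in the two sections above can also be seen from result shapes. This is only a sketch with illustrative names (`concat_shapes`, `stack_shapes`, `stack_failed`); it does not reproduce the image.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])
b = np.array([[7, 8, 9], [10, 11, 12]])

# concatenate(): joins along an existing axis, number of dimensions unchanged
concat_shapes = [np.concatenate((a, b), axis=ax).shape for ax in (0, 1)]
print(concat_shapes)  # [(4, 3), (2, 6)]

# stack(): joins along a new axis, number of dimensions grows by one
stack_shapes = [np.stack((a, b), axis=ax).shape for ax in (0, 1, 2)]
print(stack_shapes)   # [(2, 2, 3), (2, 2, 3), (2, 3, 2)]

# concatenate() tolerates a different size on the joining axis; stack() does not
c = np.arange(9).reshape(3, 3)
print(np.concatenate((a, c), axis=0).shape)  # (5, 3)
stack_failed = False
try:
    np.stack((a, c))
except ValueError:
    stack_failed = True
print(stack_failed)   # True
```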

## vstack()

• vstack--->vertical stack--->joining is always based on axis-0<br>
• For 1-D arrays--->2-D array as output.<br>
• For the remaining dimensions it acts as concatenate() along axis-0<br>

**Rules:**<br>
• The input arrays must have the same shape along all except first axis(axis-0)<br>
• 1-D arrays must have the same size.<br>
• The array formed by stacking the given arrays, will be at least 2-D.<br>
• vstack() operation is equivalent to concatenation along the first axis after 1-D arrays of shape (N,) have been reshaped to (1,N).<br>
• For 2-D or more dimension arrays, vstack() simply acts as concatenation wrt axis-0.<br>
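
The equivalence stated in the rules can be sketched as follows; the variable names are only illustrative.

```python
import numpy as np

# 1-D inputs: vstack() behaves like reshaping (N,) to (1,N), then concatenating on axis-0
a = np.array([10, 20, 30, 40])
b = np.array([50, 60, 70, 80])
via_vstack = np.vstack((a, b))
via_concat = np.concatenate((a.reshape(1, -1), b.reshape(1, -1)), axis=0)
print(via_vstack.shape)                          # (2, 4)
print(np.array_equal(via_vstack, via_concat))    # True

# 2-D inputs: vstack() is plain concatenation along axis-0
c = np.arange(1, 10).reshape(3, 3)
d = np.arange(10, 16).reshape(2, 3)
same_2d = np.array_equal(np.vstack((c, d)), np.concatenate((c, d), axis=0))
print(same_2d)                                   # True
```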

In [None]:
import numpy as np
help(np.vstack)

### For 1-D arrays

In [47]:
# vstack for 1-D arrays of same sizes
a = np.array([10,20,30,40])
b = np.array([50,60,70,80])
# a will be converted to shapes (1,4) and b will be converted to (1,4)
np.vstack((a,b))

array([[10, 20, 30, 40],
       [50, 60, 70, 80]])

In [48]:
# vstack for 1-D arrays of different sizes
a = np.array([10,20,30,40])
b = np.array([50,60,70,80,90,100])
# a will be converted to shapes (1,4) and b will be converted to (1,6)
np.vstack((a,b))

ValueError: all the input array dimensions except for the concatenation axis must match exactly, but along dimension 1, the array at index 0 has size 4 and the array at index 1 has size 6

### For 2-D arrays

In [49]:
# vstack of 2-D arrays. sizes of axis-1 should be same to perform vstack() -> concatenation rule
# Here it is possible because sizes of axis-1 are same 3 and 3
# vstack() performed always along axis-0
a = np.arange(1,10).reshape(3,3)
b = np.arange(10,16).reshape(2,3)
np.vstack((a,b))

array([[ 1,  2,  3],
       [ 4,  5,  6],
       [ 7,  8,  9],
       [10, 11, 12],
       [13, 14, 15]])

In [51]:
# vstack of 2-D arrays. sizes of axis-1 should be same to perform vstack() -> concatenation rule
# Here it is not possible because sizes of axis-1 differ: 3 and 2
# vstack() performed always along axis-0
a = np.arange(1,10).reshape(3,3)
b = np.arange(10,16).reshape(3,2)
np.vstack((a,b))

ValueError: all the input array dimensions except for the concatenation axis must match exactly, but along dimension 1, the array at index 0 has size 3 and the array at index 1 has size 2

### For 3-D arrays

• axis-0 means The number of 2-D arrays

In [52]:
a = np.arange(1,25).reshape(2,3,4)
b = np.arange(25,49).reshape(2,3,4)
print(f"array a : \n {a}")
print(f"array b : \n {b}")
result = np.vstack((a,b))
print(f"Result of vstack : \n {result}")

array a : 
 [[[ 1  2  3  4]
  [ 5  6  7  8]
  [ 9 10 11 12]]

 [[13 14 15 16]
  [17 18 19 20]
  [21 22 23 24]]]
array b : 
 [[[25 26 27 28]
  [29 30 31 32]
  [33 34 35 36]]

 [[37 38 39 40]
  [41 42 43 44]
  [45 46 47 48]]]
Result of vstack : 
 [[[ 1  2  3  4]
  [ 5  6  7  8]
  [ 9 10 11 12]]

 [[13 14 15 16]
  [17 18 19 20]
  [21 22 23 24]]

 [[25 26 27 28]
  [29 30 31 32]
  [33 34 35 36]]

 [[37 38 39 40]
  [41 42 43 44]
  [45 46 47 48]]]


## hstack()

• Same as concatenate() with joining based on axis-1 (for 1-D arrays, joining is based on axis-0)<br>
• hstack--->horizontal stack--->column wise<br>
• 1-D + 1-D --->1-D<br>

**Rules:**<br>
1. This is equivalent to concatenation along the second axis, except for 1-D arrays where it concatenates along the first axis.<br>
2. All input arrays must have the same number of dimensions.<br>
3. Except axis-1, all remaining sizes must be equal.
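
Rule 1 can be checked with a short sketch that compares hstack() with the matching concatenate() call; the names `same_1d` and `same_2d` are only illustrative.

```python
import numpy as np

# 1-D inputs: hstack() concatenates along axis-0 (the only axis available)
a = np.array([10, 20, 30, 40])
b = np.array([50, 60, 70, 80, 90, 100])
same_1d = np.array_equal(np.hstack((a, b)), np.concatenate((a, b), axis=0))
print(same_1d)  # True

# 2-D inputs: hstack() concatenates along axis-1
c = np.arange(1, 7).reshape(3, 2)
d = np.arange(7, 16).reshape(3, 3)
same_2d = np.array_equal(np.hstack((c, d)), np.concatenate((c, d), axis=1))
print(same_2d)                  # True
print(np.hstack((c, d)).shape)  # (3, 5)
```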

In [None]:
import numpy as np
help(np.hstack)

### For 1-D arrays

In [53]:
a = np.array([10,20,30,40])
b = np.array([50,60,70,80,90,100])
np.hstack((a,b))

array([ 10,  20,  30,  40,  50,  60,  70,  80,  90, 100])

### For 2-D arrays

In [54]:
a = np.arange(1,7).reshape(3,2)
b = np.arange(7,16).reshape(3,3)
np.hstack((a,b))

array([[ 1,  2,  7,  8,  9],
       [ 3,  4, 10, 11, 12],
       [ 5,  6, 13, 14, 15]])

In [55]:
a = np.arange(1,7).reshape(2,3)
b = np.arange(7,16).reshape(3,3)
np.hstack((a,b))

ValueError: all the input array dimensions except for the concatenation axis must match exactly, but along dimension 0, the array at index 0 has size 2 and the array at index 1 has size 3

## dstack()

• dstack() --->depth/height stack --->concatenation based on axis-2<br>
• 1-D and 2-D arrays will be converted to 3-D array<br>
• The result is minimum 3-D array<br>

**Rules:**
1. This is equivalent to concatenation along the third axis after 2-D arrays of shape (M,N) have been reshaped to (M,N,1) and 1-D arrays of shape (N,) have been reshaped to (1,N,1).<br>
2. The arrays must have the same shape along all but the third axis. 1-D or 2-D arrays must have the same shape.<br>
3. The array formed by stacking the given arrays, will be at least 3-D.
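
A small sketch of rule 1, with illustrative variable names: the reshaping is done by hand and compared with dstack().

```python
import numpy as np

# 1-D inputs of shape (N,) are treated as (1,N,1) and joined on axis-2
a = np.array([1, 2, 3])
b = np.array([2, 3, 4])
via_dstack = np.dstack((a, b))
manual = np.concatenate((a.reshape(1, -1, 1), b.reshape(1, -1, 1)), axis=2)
print(via_dstack.shape)                     # (1, 3, 2)
print(np.array_equal(via_dstack, manual))   # True

# 2-D inputs of shape (M,N) are treated as (M,N,1);
# for 2-D inputs this gives the same result as stack() along axis-2
m = np.arange(6).reshape(2, 3)
n = m + 10
same_as_stack = np.array_equal(np.dstack((m, n)), np.stack((m, n), axis=2))
print(np.dstack((m, n)).shape)              # (2, 3, 2)
print(same_as_stack)                        # True
```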

In [None]:
import numpy as np
help(np.dstack)

In [56]:
a = np.array([1,2,3])
b = np.array([2,3,4])
np.dstack((a,b))

array([[[1, 2],
        [2, 3],
        [3, 4]]])

In [57]:
a = np.array([[1],[2],[3]])
b = np.array([[2],[3],[4]])
np.dstack((a,b))

array([[[1, 2]],

       [[2, 3]],

       [[3, 4]]])

## Summary of joining of nd arrays


| Function    | Description                                                       |
|-------------|-------------------------------------------------------------------|
| `concatenate()` | Join a sequence of arrays along an existing axis.                 |
| `stack()`       | Join a sequence of arrays along a new axis.                       |
| `vstack()`      | Stack arrays in sequence vertically according to the first axis (axis-0). |
| `hstack()`      | Stack arrays in sequence horizontally according to the second axis (axis-1). |
| `dstack()`      | Stack arrays in sequence depth-wise according to the third axis (axis-2).   |
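
As a quick recap sketch, the five functions can be applied to the same pair of 2-D arrays and compared by result shape. The dictionary name `shapes` is only illustrative.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)
b = a + 6

shapes = {
    "concatenate": np.concatenate((a, b)).shape,  # existing axis-0
    "stack": np.stack((a, b)).shape,              # new axis-0
    "vstack": np.vstack((a, b)).shape,            # axis-0
    "hstack": np.hstack((a, b)).shape,            # axis-1
    "dstack": np.dstack((a, b)).shape,            # axis-2
}
for name, shape in shapes.items():
    print(f"{name:12s} ==> {shape}")
```

For these inputs the shapes are (4, 3), (2, 2, 3), (4, 3), (2, 6) and (2, 3, 2) respectively.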

# Chapter-11 - Splitting of arrays

**We can perform split operation on ndarrays using the following functions**<br>

1. split()<br>
2. vsplit()<br>
3. hsplit()<br>
4. dsplit()<br>
5. array_split()<br>

These functions return views of the original array, not copies, so the data is not duplicated; modifying a sub-array also modifies the original array.
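
The view behaviour can be sketched with np.shares_memory(); call .copy() on a sub-array when an independent array is needed.

```python
import numpy as np

a = np.arange(1, 10)
first, second, third = np.split(a, 3)

shares = np.shares_memory(a, first)
print(shares)        # True ==> first is a view of a

first[0] = 100       # writing through the view changes the original array
print(a[0])          # 100

independent = second.copy()   # a copy is detached from the original
independent[0] = -1
print(a[3])          # 4 ==> original unchanged
```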

## split()

### split(array, indices_or_sections, axis=0)

• Split an array into multiple sub-arrays of equal size.<br>
• sections means the number of sub-arrays<br>
• it returns list of ndarray objects.<br>
• all sections must be of equal size, otherwise error.<br>

In [None]:
import numpy as np
help(np.split)

### split() based on sections

• We can split arrays based on sections or indices.<br>
• If we split based on sections, the sizes of sub-arrays should be equal.<br>
• If we split based on indices, then the sizes of sub-arrays need not be the same

#### 1-D arrays (axis=0)

In [58]:
a = np.arange(1,10)
sub_arrays = np.split(a,3)
print(f"array a : {a}")
print(f"Type of sub_arrays :{type(sub_arrays)}")
print(f"sub_arrays : {sub_arrays}")

array a : [1 2 3 4 5 6 7 8 9]
Type of sub_arrays :<class 'list'>
sub_arrays : [array([1, 2, 3]), array([4, 5, 6]), array([7, 8, 9])]


In [60]:
# If dividing array into equal number of specified sections is not possible, then we will get error.
np.split(a,4)

ValueError: array split does not result in an equal division
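
array_split() appears in the list of split functions above. As a brief sketch, it takes the same arguments as split() but does not require an equal division, so the call that failed above succeeds with it.

```python
import numpy as np

a = np.arange(1, 10)

# 9 elements cannot be divided into 4 equal sections, so split() raises ValueError
divisible = len(a) % 4 == 0
print(divisible)  # False

# array_split() accepts it: the leading sub-arrays get one extra element
parts = np.array_split(a, 4)
sizes = [len(p) for p in parts]
print(sizes)                        # [3, 2, 2, 2]
print([p.tolist() for p in parts])  # [[1, 2, 3], [4, 5], [6, 7], [8, 9]]
```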

#### 2-D arrays (axis=0 ==> Vertical split)

• splitting is based on axis-0 by default, i.e., row wise split (vertical split)<br>
• We can also split based on axis-1. column wise split (horizontal split)

In [61]:
# split based on default axis i.e., axis-0 (Vertical Split)
a = np.arange(1,25).reshape(6,4)
result_3sections = np.split(a,3) # dividing 3 sections vertically
print(f"array a : \n {a}")
print(f"splitting the array into 3 sections along axis-0 : \n {result_3sections}")
# Note: Here we can use various possible sections: 2,3,6

array a : 
 [[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]
 [13 14 15 16]
 [17 18 19 20]
 [21 22 23 24]]
splitting the array into 3 sections along axis-0 : 
 [array([[1, 2, 3, 4],
       [5, 6, 7, 8]]), array([[ 9, 10, 11, 12],
       [13, 14, 15, 16]]), array([[17, 18, 19, 20],
       [21, 22, 23, 24]])]


In [62]:
# Note: Here we can use various possible sections: 2,3,6
result_2sections = np.split(a,2,axis=0)
result_6sections = np.split(a,6,axis=0)
print(f"splitting the array into 2 sections along axis-0 : \n {result_2sections}")
print(f"splitting the array into 6 sections along axis-0 : \n {result_6sections}")

splitting the array into 2 sections along axis-0 : 
 [array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]]), array([[13, 14, 15, 16],
       [17, 18, 19, 20],
       [21, 22, 23, 24]])]
splitting the array into 6 sections along axis-0 : 
 [array([[1, 2, 3, 4]]), array([[5, 6, 7, 8]]), array([[ 9, 10, 11, 12]]), array([[13, 14, 15, 16]]), array([[17, 18, 19, 20]]), array([[21, 22, 23, 24]])]


#### 2-D arrays (axis=1 ==> Horizontal split)

In [63]:
# split based on axis-1 (horizontal split)
# for the shape(6,4) we can perform 4 or 2 sections
a = np.arange(1,25).reshape(6,4)
result_2sections = np.split(a,2,axis=1) # dividing 2 sections horizontally
result_4sections = np.split(a,4,axis=1) # dividing 4 sections horizontally
print(f"splitting the array into 2 sections along axis-1 : \n {result_2sections}")
print(f"splitting the array into 4 sections along axis-1 : \n {result_4sections}")

splitting the array into 2 sections along axis-1 : 
 [array([[ 1,  2],
       [ 5,  6],
       [ 9, 10],
       [13, 14],
       [17, 18],
       [21, 22]]), array([[ 3,  4],
       [ 7,  8],
       [11, 12],
       [15, 16],
       [19, 20],
       [23, 24]])]
splitting the array into 4 sections along axis-1 : 
 [array([[ 1],
       [ 5],
       [ 9],
       [13],
       [17],
       [21]]), array([[ 2],
       [ 6],
       [10],
       [14],
       [18],
       [22]]), array([[ 3],
       [ 7],
       [11],
       [15],
       [19],
       [23]]), array([[ 4],
       [ 8],
       [12],
       [16],
       [20],
       [24]])]


### split() based on indices

• We can also split based on indices. The sizes of sub-arrays need not be equal.

#### 1-D arrays(axis=0)

<img src="split_1.JPG" alt="split1" width="700" height="700">

In [65]:
# splitting the 1-D array based on indices
a = np.arange(10,101,10)
result = np.split(a,[3,7])
print(f"array a : {a}")
print(f"splitting the 1-D array based on indices : \n {result}")

array a : [ 10  20  30  40  50  60  70  80  90 100]
splitting the 1-D array based on indices : 
 [array([10, 20, 30]), array([40, 50, 60, 70]), array([ 80,  90, 100])]


In [67]:
# splitting the 1-D array based on indices
a = np.arange(10,101,10)
result = np.split(a,[2,5,7])
# [2,5,7] ==> 4 subarrays
# subarray-1 : before index-2 ==> 0,1
# subarray-2 : from index-2 to before index-5 ==> 2,3,4
# subarray-3 : from index-5 to before index-7 ==> 5,6
# subarray-4 : from index-7 to last index ==> 7,8,9
print(f"array a : {a}")
print(f"splitting the 1-D array based on indices : \n {result}")

array a : [ 10  20  30  40  50  60  70  80  90 100]
splitting the 1-D array based on indices : 
 [array([10, 20]), array([30, 40, 50]), array([60, 70]), array([ 80,  90, 100])]
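
The index rule described in the comments above is the same as ordinary slicing between consecutive indices. A minimal sketch, with the illustrative names `bounds` and `by_slices`:

```python
import numpy as np

a = np.arange(10, 101, 10)
indices = [2, 5, 7]

by_split = np.split(a, indices)

# Equivalent slices: a[0:2], a[2:5], a[5:7], a[7:10]
bounds = [0, *indices, len(a)]
by_slices = [a[start:stop] for start, stop in zip(bounds[:-1], bounds[1:])]

same = all(np.array_equal(x, y) for x, y in zip(by_split, by_slices))
print(len(by_split))  # 4
print(same)           # True
```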


#### 2-D arrays(axis=0)

<img src="split_2.JPG" alt="split2" width="500" height="600">

In [69]:
# splitting 2-D arrays based on indices along axis=0
a = np.arange(1,13).reshape(6,2)
result = np.split(a,[3,4])
print(f"array a : \n {a}")
print(f"resultant array after vertical split : \n {result}")

array a : 
 [[ 1  2]
 [ 3  4]
 [ 5  6]
 [ 7  8]
 [ 9 10]
 [11 12]]
resultant array after vertical split : 
 [array([[1, 2],
       [3, 4],
       [5, 6]]), array([[7, 8]]), array([[ 9, 10],
       [11, 12]])]


#### 2-D arrays(axis=1)

<img src="split_3.JPG" alt="split3" width="600" height="600">

In [82]:
a = np.arange(1,19).reshape(3,6)
result = np.split(a,[1,3,5],axis=1)
print(f"array a : \n {a}")
print(f"resultant array after horizontal split : \n {result}")

array a : 
 [[ 1  2  3  4  5  6]
 [ 7  8  9 10 11 12]
 [13 14 15 16 17 18]]
resultant array after horizontal split : 
 [array([[ 1],
       [ 7],
       [13]]), array([[ 2,  3],
       [ 8,  9],
       [14, 15]]), array([[ 4,  5],
       [10, 11],
       [16, 17]]), array([[ 6],
       [12],
       [18]])]


In [83]:
a = np.arange(1,19).reshape(3,6)
result = np.split(a,[2,4,4],axis=1)
# [2,4,4] => 4 subarrays are created
# subarray1: 2 ==> before index-2 ==> 0,1
# subarray2: 4 ==> from index-2 to before index-4 ==> 2,3
# subarray3: 4 ==> from index-4 to before index-4 ==> empty array
# subarray4: ==> from index-4 to last index ==> 4,5
print(f"array a : \n {a}")
print(f"resultant array after horizontal split : \n {result}")
print(f"first subarray : \n {result[0]}")
print(f"second subarray : \n {result[1]}")
print(f"third subarray : \n {result[2]}")
print(f"fourth subarray : \n {result[3]}")

array a : 
 [[ 1  2  3  4  5  6]
 [ 7  8  9 10 11 12]
 [13 14 15 16 17 18]]
resultant array after horizontal split : 
 [array([[ 1,  2],
       [ 7,  8],
       [13, 14]]), array([[ 3,  4],
       [ 9, 10],
       [15, 16]]), array([], shape=(3, 0), dtype=int32), array([[ 5,  6],
       [11, 12],
       [17, 18]])]
first subarray : 
 [[ 1  2]
 [ 7  8]
 [13 14]]
second subarray : 
 [[ 3  4]
 [ 9 10]
 [15 16]]
third subarray : 
 []
fourth subarray : 
 [[ 5  6]
 [11 12]
 [17 18]]


In [84]:
a = np.arange(1,19).reshape(3,6)
result = np.split(a,[0,2,6],axis=1)
# [0,2,6] => 4 subarrays are created
# subarray1: 0 ==> before index-0 ==> empty
# subarray2: 2 ==> from index-0 to before index-2 ==> 0,1
# subarray3: 6 ==> from index-2 to before index-6 ==> 2,3,4,5
# subarray4: ==> from index-6 to last index ==> empty
print(f"array a : \n {a}")
print(f"resultant array after horizontal split : \n {result}")
print(f"first subarray : \n {result[0]}")
print(f"second subarray : \n {result[1]}")
print(f"third subarray : \n {result[2]}")
print(f"fourth subarray : \n {result[3]}")

array a : 
 [[ 1  2  3  4  5  6]
 [ 7  8  9 10 11 12]
 [13 14 15 16 17 18]]
resultant array after horizontal split : 
 [array([], shape=(3, 0), dtype=int32), array([[ 1,  2],
       [ 7,  8],
       [13, 14]]), array([[ 3,  4,  5,  6],
       [ 9, 10, 11, 12],
       [15, 16, 17, 18]]), array([], shape=(3, 0), dtype=int32)]
first subarray : 
 []
second subarray : 
 [[ 1  2]
 [ 7  8]
 [13 14]]
third subarray : 
 [[ 3  4  5  6]
 [ 9 10 11 12]
 [15 16 17 18]]
fourth subarray : 
 []


In [85]:
a = np.arange(1,19).reshape(3,6)
result = np.split(a,[1,5,3],axis=1)
# [1,5,3] => 4 subarrays are created
# subarray1: 1 ==> before index-1 ==> 0
# subarray2: 5 ==> from index-1 to before index-5 ==> 1,2,3,4
# subarray3: 3 ==> from index-5 to before index-3 ==> empty
# subarray4: ==> from index-3 to last index ==> 3,4,5
print(f"array a : \n {a}")
print(f"resultant array after horizontal split : \n {result}")
print(f"first subarray : \n {result[0]}")
print(f"second subarray : \n {result[1]}")
print(f"third subarray : \n {result[2]}")
print(f"fourth subarray : \n {result[3]}")

array a : 
 [[ 1  2  3  4  5  6]
 [ 7  8  9 10 11 12]
 [13 14 15 16 17 18]]
resultant array after horizontal split : 
 [array([[ 1],
       [ 7],
       [13]]), array([[ 2,  3,  4,  5],
       [ 8,  9, 10, 11],
       [14, 15, 16, 17]]), array([], shape=(3, 0), dtype=int32), array([[ 4,  5,  6],
       [10, 11, 12],
       [16, 17, 18]])]
first subarray : 
 [[ 1]
 [ 7]
 [13]]
second subarray : 
 [[ 2  3  4  5]
 [ 8  9 10 11]
 [14 15 16 17]]
third subarray : 
 []
fourth subarray : 
 [[ 4  5  6]
 [10 11 12]
 [16 17 18]]
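
The index-based results above follow ordinary slicing rules: each sub-array is the slice between two consecutive boundaries, which is why an out-of-order index such as 3 after 5 yields an empty piece. The following is a minimal sketch of that idea, not NumPy's actual implementation; the names `bounds` and `manual` are illustrative only.

```python
import numpy as np

a = np.arange(1, 19).reshape(3, 6)
indices = [1, 5, 3]

# Boundaries: start of the axis, each given index, then the axis length
bounds = [0] + indices + [a.shape[1]]

# Slice between consecutive boundaries; a[:, 5:3] is an empty slice
manual = [a[:, start:stop] for start, stop in zip(bounds[:-1], bounds[1:])]
result = np.split(a, indices, axis=1)

shapes = [piece.shape for piece in result]
matches = all(np.array_equal(p, m) for p, m in zip(result, manual))
print(shapes)   # [(3, 1), (3, 4), (3, 0), (3, 3)]
print(matches)  # True
```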


## vsplit()

• vsplit means vertical split, i.e. row-wise split<br>
• The split is always performed along axis-0
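
As a small illustrative check (the variable names are assumptions made for this sketch), vsplit() gives the same pieces as split() with axis=0, and vstack() from Chapter-10 puts the pieces back together:

```python
import numpy as np

a = np.arange(1, 13).reshape(6, 2)

by_vsplit = np.vsplit(a, 3)
by_split = np.split(a, 3, axis=0)

# Both calls produce the same three (2, 2) sub-arrays
same_pieces = all(np.array_equal(x, y) for x, y in zip(by_vsplit, by_split))

# Stacking the pieces vertically restores the original array
restored = np.vstack(by_vsplit)

print(same_pieces)                 # True
print(np.array_equal(restored, a)) # True
```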

In [93]:
import numpy as np
help(np.vsplit)

Help on function vsplit in module numpy:

vsplit(ary, indices_or_sections)
    Split an array into multiple sub-arrays vertically (row-wise).
    
    Please refer to the ``split`` documentation.  ``vsplit`` is equivalent
    to ``split`` with `axis=0` (default), the array is always split along the
    first axis regardless of the array dimension.
    
    See Also
    --------
    split : Split an array into multiple sub-arrays of equal size.
    
    Examples
    --------
    >>> x = np.arange(16.0).reshape(4, 4)
    >>> x
    array([[ 0.,   1.,   2.,   3.],
           [ 4.,   5.,   6.,   7.],
           [ 8.,   9.,  10.,  11.],
           [12.,  13.,  14.,  15.]])
    >>> np.vsplit(x, 2)
    [array([[0., 1., 2., 3.],
           [4., 5., 6., 7.]]), array([[ 8.,  9., 10., 11.],
           [12., 13., 14., 15.]])]
    >>> np.vsplit(x, np.array([3, 6]))
    [array([[ 0.,  1.,  2.,  3.],
           [ 4.,  5.,  6.,  7.],
           [ 8.,  9., 10., 11.]]), array([[12., 13., 14., 15.]]), arr

### 1-D arrays

• To use vsplit, the input array should be at least a 2-D array<br>
• It is not possible to split a 1-D array vertically.

In [94]:
a = np.arange(10)
np.vsplit(a,2)

ValueError: vsplit only works on arrays of 2 or more dimensions

### 2-D arrays

In [96]:
a = np.arange(1,13).reshape(6,2)
print(f'array is: {a}')
np.vsplit(a,2)

array is: [[ 1  2]
 [ 3  4]
 [ 5  6]
 [ 7  8]
 [ 9 10]
 [11 12]]


[array([[1, 2],
        [3, 4],
        [5, 6]]),
 array([[ 7,  8],
        [ 9, 10],
        [11, 12]])]

In [97]:
np.vsplit(a,3)

[array([[1, 2],
        [3, 4]]),
 array([[5, 6],
        [7, 8]]),
 array([[ 9, 10],
        [11, 12]])]

In [98]:
np.vsplit(a,6)

[array([[1, 2]]),
 array([[3, 4]]),
 array([[5, 6]]),
 array([[7, 8]]),
 array([[ 9, 10]]),
 array([[11, 12]])]

In [90]:
# vsplit() based on indices:
a = np.arange(1,13).reshape(6,2)
np.vsplit(a,[3,4])

[array([[1, 2],
        [3, 4],
        [5, 6]]),
 array([[7, 8]]),
 array([[ 9, 10],
        [11, 12]])]

## hsplit()

• hsplit ---> means horizontal split (column-wise)<br>
• For arrays with 2 or more dimensions the split happens along the 2nd axis (axis-1); a 1-D array is split along axis-0
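
A short sketch, with illustrative variable names, comparing hsplit() with split(): for a 2-D array it matches axis=1, while a 1-D array has only axis-0 and is split along that axis.

```python
import numpy as np

m = np.arange(1, 13).reshape(3, 4)   # 2-D array
v = np.arange(10)                    # 1-D array

# 2-D: hsplit matches split along axis=1
same_2d = all(
    np.array_equal(x, y)
    for x, y in zip(np.hsplit(m, 2), np.split(m, 2, axis=1))
)

# 1-D: only axis 0 exists, so hsplit splits along it
same_1d = all(
    np.array_equal(x, y)
    for x, y in zip(np.hsplit(v, 2), np.split(v, 2, axis=0))
)

print(same_2d, same_1d)  # True True
```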

In [None]:
import numpy as np
help(np.hsplit)

### 1-D arrays

In [91]:
a = np.arange(10)
np.hsplit(a,2)

[array([0, 1, 2, 3, 4]), array([5, 6, 7, 8, 9])]

### 2-D arrays

• The split is based on axis-1 only

In [92]:
a = np.arange(1,13).reshape(3,4)
np.hsplit(a,2)

[array([[ 1,  2],
        [ 5,  6],
        [ 9, 10]]),
 array([[ 3,  4],
        [ 7,  8],
        [11, 12]])]

In [99]:
# hsplit() based on indices:
a = np.arange(10,101,10)
np.hsplit(a,[2,4])

[array([10, 20]), array([30, 40]), array([ 50,  60,  70,  80,  90, 100])]

In [100]:
a = np.arange(24).reshape(4,6)
np.hsplit(a,[2,4])

[array([[ 0,  1],
        [ 6,  7],
        [12, 13],
        [18, 19]]),
 array([[ 2,  3],
        [ 8,  9],
        [14, 15],
        [20, 21]]),
 array([[ 4,  5],
        [10, 11],
        [16, 17],
        [22, 23]])]

## dsplit()

• dsplit ---> means depth split<br>
• The split happens along the 3rd axis (axis-2), so the input array must have at least 3 dimensions
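
The sketch below (variable names are illustrative) checks that dsplit() matches split() with axis=2, and that a 2-D input is rejected because it has no 3rd axis:

```python
import numpy as np

a = np.arange(24).reshape(2, 3, 4)

# dsplit gives the same pieces as split along axis=2
same_pieces = all(
    np.array_equal(x, y)
    for x, y in zip(np.dsplit(a, 2), np.split(a, 2, axis=2))
)

# A 2-D array has no axis-2, so dsplit raises ValueError
try:
    np.dsplit(np.arange(12).reshape(3, 4), 2)
    raised = False
except ValueError:
    raised = True

print(same_pieces, raised)  # True True
```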

In [None]:
import numpy as np
help(np.dsplit)

In [101]:
a = np.arange(24).reshape(2,3,4)
a

array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]],

       [[12, 13, 14, 15],
        [16, 17, 18, 19],
        [20, 21, 22, 23]]])

In [102]:
# dsplit based on sections
np.dsplit(a,2)

[array([[[ 0,  1],
         [ 4,  5],
         [ 8,  9]],
 
        [[12, 13],
         [16, 17],
         [20, 21]]]),
 array([[[ 2,  3],
         [ 6,  7],
         [10, 11]],
 
        [[14, 15],
         [18, 19],
         [22, 23]]])]

In [103]:
# dsplit based on indices
np.dsplit(a,[1,3])

[array([[[ 0],
         [ 4],
         [ 8]],
 
        [[12],
         [16],
         [20]]]),
 array([[[ 1,  2],
         [ 5,  6],
         [ 9, 10]],
 
        [[13, 14],
         [17, 18],
         [21, 22]]]),
 array([[[ 3],
         [ 7],
         [11]],
 
        [[15],
         [19],
         [23]]])]

## array_split()

• In the case of **split()** with sections, the array must be split into equal parts. If equal parts are not possible, we get an error (ValueError).<br>
• In the case of **array_split()** we won't get any error.<br>

1. The only difference between split() and array_split() is that array_split() allows 'indices_or_sections' to be an integer that does not equally divide the axis.
2. For an array of length x that should be split into n sections, it returns x % n subarrays of size x//n + 1 and the rest of size x//n.
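
The size rule above can be checked directly. In this sketch, `expected_sizes` is a hypothetical helper written only for illustration; it is not part of NumPy.

```python
import numpy as np

def expected_sizes(x, n):
    """Sizes from the rule: x % n pieces of x//n + 1, the rest of x//n."""
    big = x % n
    return [x // n + 1] * big + [x // n] * (n - big)

checks = {}
for x, n in [(10, 3), (11, 3), (6, 4)]:
    pieces = np.array_split(np.arange(x), n)
    actual = [len(p) for p in pieces]
    checks[(x, n)] = (actual, expected_sizes(x, n))
    print(x, n, actual)

# 10 3 [4, 3, 3]
# 11 3 [4, 4, 3]
# 6 4 [2, 2, 1, 1]
```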

In [None]:
import numpy as np
help(np.array_split)

In [104]:
# x % n sub-arrays of size x//n + 1
# and the rest of size x//n
# eg-1:
# 10 elements --->3 sections
# 10%3(1) sub-arrays of size 10//3+1(4)
# and the rest(2) of size 10//3 (3)
# 1 sub-array of size 4 and the rest of size 3
#(4,3,3)
a = np.arange(10,101,10)
np.array_split(a,3)

[array([10, 20, 30, 40]), array([50, 60, 70]), array([ 80,  90, 100])]

In [105]:
# Eg: 2
# 11 elements 3 sections
# it returns x % n (11%3=2)sub-arrays of size x//n + 1(11//3+1=4)
# and the rest(1) of size x//n.(11//3=3)
# (4,4,3)
a = np.arange(11)
np.array_split(a,3)

[array([0, 1, 2, 3]), array([4, 5, 6, 7]), array([ 8,  9, 10])]

In [106]:
# 2-D array (split along axis-0, so x is the number of rows)
# x=6 n=4
# x % n (6%4=2) sub-arrays of size x//n + 1 (6//4+1=2)
# and the rest (2) of size x//n (6//4=1)
# (2,2,1,1)
a = np.arange(24).reshape(6,4)
a

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15],
       [16, 17, 18, 19],
       [20, 21, 22, 23]])

In [107]:
np.array_split(a,4)

[array([[0, 1, 2, 3],
        [4, 5, 6, 7]]),
 array([[ 8,  9, 10, 11],
        [12, 13, 14, 15]]),
 array([[16, 17, 18, 19]]),
 array([[20, 21, 22, 23]])]

## Summary of split methods

| Method       | Description                                                                                     |
|--------------|-------------------------------------------------------------------------------------------------|
| `split()`    | Split an array into multiple sub-arrays of equal size. Raises error if equal division cannot be made. |
| `vsplit()`   | Split array into multiple sub-arrays vertically (row wise).                                     |
| `hsplit()`   | Split array into multiple sub-arrays horizontally (column-wise).                                |
| `dsplit()`   | Split array into multiple sub-arrays along the 3rd axis (depth).                                |
| `array_split()` | Split an array into multiple sub-arrays of equal or near-equal size. Does not raise an exception if an equal division cannot be made. |
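
To make the difference between the first and last rows of the table concrete, here is a minimal sketch (variable names are illustrative) that splits 10 elements into 3 sections with both functions:

```python
import numpy as np

a = np.arange(10)

# split() needs an equal division: 10 elements cannot form 3 equal parts
try:
    np.split(a, 3)
    split_failed = False
except ValueError as exc:
    split_failed = True
    print(f"split() failed: {exc}")

# array_split() accepts the same request and returns near-equal parts
pieces = [p.tolist() for p in np.array_split(a, 3)]
print(pieces)  # [[0, 1, 2, 3], [4, 5, 6], [7, 8, 9]]
```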


# Chapter-12 - Sorting elements of nd arrays

Sorting elements of nd arrays<br>

• We can sort the elements of an nd array.<br>
• The numpy module contains the sort() function.<br>
• The default sorting algorithm is quicksort, and the result is in ascending order<br>
• We can also specify mergesort, heapsort etc. through the kind parameter<br>
• For numbers-->ascending order<br>
• For strings-->alphabetical order
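
A small sketch of the points above, using illustrative variable names: the algorithm is chosen with the `kind` parameter, every choice gives the same ascending result, and `np.sort()` returns a sorted copy while leaving the input array unchanged.

```python
import numpy as np

a = np.array([70, 20, 60, 10, 50, 40, 30])
original = a.copy()

by_default = np.sort(a)                   # default kind is 'quicksort'
by_merge = np.sort(a, kind='mergesort')
by_heap = np.sort(a, kind='heapsort')

# All algorithms agree on the sorted order
same_result = (
    np.array_equal(by_default, by_merge)
    and np.array_equal(by_default, by_heap)
)

# np.sort() works on a copy, so 'a' still holds the original order
unchanged = np.array_equal(a, original)

print(by_default)              # [10 20 30 40 50 60 70]
print(same_result, unchanged)  # True True
```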

In [None]:
import numpy as np
help(np.sort)

## 1-D arrays

In [108]:
# 1-D arrays
a = np.array([70,20,60,10,50,40,30])
sorted_array = np.sort(a)
print(f"Original array a : {a}")
print(f"Sorted array(Ascending by default) : {sorted_array}")

Original array a : [70 20 60 10 50 40 30]
Sorted array(Ascending by default) : [10 20 30 40 50 60 70]


In [109]:
# Descending the 1-D arrays
# 1st way :: np.sort(a)[::-1]
a = np.array([70,20,60,10,50,40,30])
sorted_array = np.sort(a)[::-1]
print(f"Original array a : {a}")
print(f"Sorted array(Descending) : {sorted_array}")

Original array a : [70 20 60 10 50 40 30]
Sorted array(Descending) : [70 60 50 40 30 20 10]


In [110]:
# Descending the 1-D arrays
# 2nd way :: -np.sort(-a)
a = np.array([70,20,60,10,50,40,30])
-a

array([-70, -20, -60, -10, -50, -40, -30])

In [111]:
np.sort(-a) # Ascending order

array([-70, -60, -50, -40, -30, -20, -10])

In [112]:
-np.sort(-a) # Descending Order

array([70, 60, 50, 40, 30, 20, 10])

In [113]:
# To sort string elements in alphabetical order
a = np.array(['cat','rat','bat','vat','dog'])
ascending = np.sort(a)
descending = np.sort(a)[::-1]
# -np.sort(-a) ==> the 2nd way is not possible for strings (they cannot be negated)
print(f"Original array a : {a}")
print(f"Sorted array(Ascending) : {ascending}")
print(f"Sorted array(Descending) : {descending}")

Original array a : ['cat' 'rat' 'bat' 'vat' 'dog']
Sorted array(Ascending) : ['bat' 'cat' 'dog' 'rat' 'vat']
Sorted array(Descending) : ['vat' 'rat' 'dog' 'cat' 'bat']


## 2-D arrays

• axis-0 ---> runs down the rows (axis = -2)<br>
• axis-1 ---> runs across the columns (axis = -1) ==> default value<br>
• By default a 2-D array is sorted along the last axis, so the elements of every row (each inner 1-D array) are sorted independently.
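
The effect of the axis parameter can be seen by sorting the same array three ways. This is a minimal sketch with illustrative variable names:

```python
import numpy as np

a = np.array([[40, 20, 70], [30, 20, 60], [70, 90, 80]])

rows_sorted = np.sort(a)             # default axis=-1: sort within each row
cols_sorted = np.sort(a, axis=0)     # sort within each column
flat_sorted = np.sort(a, axis=None)  # flatten first, then sort

print(rows_sorted)
# [[20 40 70]
#  [20 30 60]
#  [70 80 90]]
print(cols_sorted)
# [[30 20 60]
#  [40 20 70]
#  [70 90 80]]
print(flat_sorted)
# [20 20 30 40 60 70 70 80 90]
```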

In [114]:
a= np.array([[40,20,70],[30,20,60],[70,90,80]])
ascending = np.sort(a)
print(f"Original array a :\n {a}")
print(f"Sorted array(Ascending) : \n {ascending}")

Original array a :
 [[40 20 70]
 [30 20 60]
 [70 90 80]]
Sorted array(Ascending) : 
 [[20 40 70]
 [20 30 60]
 [70 80 90]]


### order parameter
• Use the order keyword to specify a field to use when sorting a structured array:

In [115]:
# creating the structured array
dtype = [('name', 'S10'), ('height', float), ('age', int)]
values = [('Gopie', 1.7, 45), ('Vikranth', 1.5, 38),('Sathwik', 1.8, 28)]
a = np.array(values, dtype=dtype)
sort_height = np.sort(a, order='height')
sort_age = np.sort(a,order='age')
print(f"Original Array :\n {a}")
print(f"Sorting based on height :\n {sort_height}")
print(f"Sorting based on age :\n {sort_age}")

Original Array :
 [(b'Gopie', 1.7, 45) (b'Vikranth', 1.5, 38) (b'Sathwik', 1.8, 28)]
Sorting based on height :
 [(b'Vikranth', 1.5, 38) (b'Gopie', 1.7, 45) (b'Sathwik', 1.8, 28)]
Sorting based on age :
 [(b'Sathwik', 1.8, 28) (b'Vikranth', 1.5, 38) (b'Gopie', 1.7, 45)]


In [116]:
# Sort by age, then height if ages are equal
dtype = [('name', 'S10'), ('height', float), ('age', int)]
values = [('Gopie', 1.7, 45), ('Vikranth', 1.5, 38),('Sathwik', 1.8, 28),('Rudra', 1.5, 28)]
a = np.array(values, dtype=dtype)
sort_age_height = np.sort(a, order=['age', 'height'])
print(f"Original Array :\n {a}")
print(f"Sorting based on age, then height :\n {sort_age_height}")

Original Array :
 [(b'Gopie', 1.7, 45) (b'Vikranth', 1.5, 38) (b'Sathwik', 1.8, 28)
 (b'Rudra', 1.5, 28)]
Sorting based on age, then height :
 [(b'Rudra', 1.5, 28) (b'Sathwik', 1.8, 28) (b'Vikranth', 1.5, 38)
 (b'Gopie', 1.7, 45)]


# Chapter-13 - Searching elements of ndarray

## Searching elements of ndarray

- **Function Used**: `where()`
  - Syntax: `where(condition, [x, y])`
- **Functionality**:
  - If only the condition is specified, it returns the indices of the elements that satisfy the condition.
  - If `condition`, `x`, `y` are provided, it returns a new array:
    - Where the condition is satisfied, the result takes its value from `x`.
    - Everywhere else, the result takes its value from `y`.
- **Returns**:
  - With only the condition, it does not return the elements directly, only their indices (as a tuple of arrays, one per dimension).
- **Additional Info**:
  - With `x` and `y`, it acts as a replacement operator; the original array is not modified.
  - Similar to a ternary operator, applied element by element.
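
The comparison with a ternary operator can be made explicit. In this sketch (variable names are illustrative), `x` and `y` are themselves arrays, and the result of where() is compared with an element-by-element Python ternary expression:

```python
import numpy as np

a = np.array([3, 5, 7, 6, 7, 9, 4, 6, 10, 15])

# Even numbers are multiplied by 10, odd numbers are negated
by_where = np.where(a % 2 == 0, a * 10, -a)

# The same rule written as a ternary expression per element
by_ternary = np.array([v * 10 if v % 2 == 0 else -v for v in a.tolist()])

print(by_where)  # [ -3  -5  -7  60  -7  -9  40  60 100 -15]
print(np.array_equal(by_where, by_ternary))  # True

# With only the condition, where() returns a tuple of index arrays
idx = np.where(a == 7)
print(idx[0].tolist())  # [2, 4]
```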

In [None]:
import numpy as np
help(np.where)

## where() function

In [117]:
# Find the indices where the value is 7 in a 1-D array
a = np.array([3,5,7,6,7,9,4,6,10,15])
b = np.where(a==7)
b # element 7 is present at indices 2 and 4

(array([2, 4], dtype=int64),)

In [118]:
# Find the indices where odd numbers are present in the given 1-D array
a = np.array([3,5,7,6,7,9,4,6,10,15])
b = np.where(a%2!=0)
b

(array([0, 1, 2, 4, 5, 9], dtype=int64),)

## Finding the elements directly

We can get the elements directly in 2 ways<br>
1. using where() function<br>
2. using condition based selection

### where() function

In [119]:
# to get the odd numbers
a = np.array([3,5,7,6,7,9,4,6,10,15])
indices = np.where(a%2!=0)
a[indices]

array([ 3,  5,  7,  7,  9, 15])

### condition based selection

In [120]:
# to get the odd numbers
a = np.array([3,5,7,6,7,9,4,6,10,15])
a[a%2!=0]

array([ 3,  5,  7,  7,  9, 15])

In [123]:
# where(condition,[x,y])
# if the condition is satisfied, the element is taken from x and
# if the condition fails, the element is taken from y.
# Replace every even number with 8888 and every odd number with 7777
a = np.array([3,5,7,6,7,9,4,6,10,15])
b = np.where( a%2 == 0, 8888, 7777)
b

array([7777, 7777, 7777, 8888, 7777, 7777, 8888, 8888, 8888, 7777])

In [124]:
# Replace every odd number in the given 1-D array with 9999 and keep the even numbers as they are.
a = np.array([3,5,7,6,7,9,4,6,10,15])
b = np.where( a%2 != 0, 9999, a)
b

array([9999, 9999, 9999,    6, 9999, 9999,    4,    6,   10, 9999])

### We can use where() function for any n-dimensional array

#### 2-D arrays
• For a 2-D array it returns 2 arrays.<br>
• The first array holds the row indices<br>
• The second array holds the column indices
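
A brief sketch of how the two index arrays fit together (variable names are illustrative): pairing them gives (row, column) positions, and using them as an index returns the matching elements. `np.argwhere()` is shown as an alternative that returns the same positions as one 2-D array.

```python
import numpy as np

a = np.arange(12).reshape(4, 3)

rows, cols = np.where(a % 5 == 0)

# Pair row and column indices into (row, column) positions
pairs = list(zip(rows.tolist(), cols.tolist()))
print(pairs)  # [(0, 0), (1, 2), (3, 1)]

# Index with both arrays to get the elements themselves
values = a[rows, cols]
print(values)  # [ 0  5 10]

# argwhere() returns one row per matching position
coords = np.argwhere(a % 5 == 0)
print(coords.tolist())  # [[0, 0], [1, 2], [3, 1]]
```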

In [125]:
# to find the indices of the elements where elements are divisible by 5
a = np.arange(12).reshape(4,3)
np.where(a%5==0)


#• The first array, array([0, 1, 3]), represents the row indices
#• The second array, array([0, 2, 1]), represents the column indices
#• The required elements are present at index positions (0,0), (1,2) and (3,1).

(array([0, 1, 3], dtype=int64), array([0, 2, 1], dtype=int64))

In [126]:
# we can perform replacement on 2-D arrays
a = np.arange(12).reshape(4,3)
np.where(a%5==0,9999,a)

array([[9999,    1,    2],
       [   3,    4, 9999],
       [   6,    7,    8],
       [   9, 9999,   11]])

## searchsorted() function

• Internally this function uses the **Binary Search algorithm**. Hence we should call this function **only for sorted arrays**.<br>
• If the array is not sorted, the results are not meaningful.<br>
• The complexity of the Binary search algorithm is O(log n)<br>
• It returns the insertion point (i.e., index) at which the given element has to be inserted to keep the array sorted
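
To illustrate what "insertion point" means, the sketch below (variable names are illustrative) inserts the value at the index returned by searchsorted() using `np.insert()` and checks that the array is still sorted:

```python
import numpy as np

a = np.arange(0, 31, 5)          # [ 0  5 10 15 20 25 30], already sorted

idx = np.searchsorted(a, 6)      # index where 6 would be inserted
b = np.insert(a, idx, 6)         # new array with 6 placed at that index

still_sorted = bool(np.all(b[:-1] <= b[1:]))

print(idx)           # 2
print(b)             # [ 0  5  6 10 15 20 25 30]
print(still_sorted)  # True
```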

In [None]:
import numpy as np
help(np.searchsorted)

In [127]:
# to find the insertion point of 6 from left
a = np.arange(0,31,5)
np.searchsorted(a,6)

2

**Note**<br>
• By default it searches from the left hand side, i.e. it returns the first suitable insertion point.<br>
• If we want the last suitable insertion point (search from the right hand side), we should use **side='right'**
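
The two sides differ only when the value is already present in the array. As a minimal sketch (variable names are illustrative), the gap between the right and left insertion points equals the number of occurrences of the value:

```python
import numpy as np

a = np.sort(np.array([3, 5, 7, 6, 7, 9, 4, 10, 15, 6]))
# a is now [ 3  4  5  6  6  7  7  9 10 15]

left = np.searchsorted(a, 6, side='left')    # before the existing 6s
right = np.searchsorted(a, 6, side='right')  # after the existing 6s
count = right - left                         # how many 6s are present

# For a value that is absent, both sides give the same index
left_8 = np.searchsorted(a, 8, side='left')
right_8 = np.searchsorted(a, 8, side='right')

print(left, right, count)  # 3 5 2
print(left_8, right_8)     # 7 7
```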

In [128]:
# to find the insertion point of 6 from right
a = np.arange(0,31,5)
print(f"Array a :\n {a}")
np.searchsorted(a,6,side='right')

Array a :
 [ 0  5 10 15 20 25 30]


2

In [129]:
# to find the insertion point from left and right
a = np.array([3,5,7,6,7,9,4,10,15,6])
# first sort the elements
a = np.sort(a)
# insertion point from left(default)
left = np.searchsorted(a,6)
# insertion point from right
right = np.searchsorted(a,6,side='right')
print(f"The sorted array : {a}")
print(f"Insertion point for 6 from left : {left}")
print(f"Insertion point for 6 from right : {right}")

The sorted array : [ 3  4  5  6  6  7  7  9 10 15]
Insertion point for 6 from left : 3
Insertion point for 6 from right : 5


### Summary:

| Function        | Description                                      |
|-----------------|--------------------------------------------------|
| `sort()`        | To sort given array                              |
| `where()`       | To perform search and replace operation          |
| `searchsorted()`| To identify insertion point in the given sorted array |