# Array Manipulation

**Table of contents**<a id='toc0_'></a>    
- [<u>Shape manipulation</u>](#toc1_)
  - [Transpose of an array](#toc1_1_)
  - [Reshaping](#toc1_2_)
  - [Flattening](#toc1_3_)
  - [Expanding](#toc1_4_)
  - [Squeezing](#toc1_5_)
- [<u>Array repetition</u>](#toc2_)
  - [Repeat an Array a given number of times](#toc2_1_)
  - [Repeat elements of an array](#toc2_2_)
- [<u>Broadcasting</u>](#toc3_)
- [<u>Joining Arrays</u>](#toc4_)
  - [Concatenation](#toc4_1_)
  - [Stacking](#toc4_2_)
  - [Horizontal (column-wise) stacking](#toc4_3_)
  - [Vertical (row-wise) stacking](#toc4_4_)
- [<u>Splitting Arrays</u>](#toc5_)
  - [Splitting an array into multiple sub-arrays of equal length](#toc5_1_)
  - [Horizontal (column-wise) splitting](#toc5_2_)
  - [Vertical (row-wise) splitting](#toc5_3_)

<!-- vscode-jupyter-toc-config
	numbering=false
	anchor=true
	flat=false
	minLevel=2
	maxLevel=4
	/vscode-jupyter-toc-config -->
<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->

In [4]:
# import statements
import numpy as np
from numpy.random import default_rng

## <a id='toc1_'></a>[<u>Shape manipulation</u>](#toc0_)

Some array operations, such as joining two or more arrays or applying mathematical operations, will either fail or behave in an unexpected way when the array dimensions do not match. This is why manipulating array dimensions is such an important operation.

In [2]:
# random example array
rng = default_rng()
ary = rng.integers(low=5, high=20, size=(4, 3))

In [3]:
ary

array([[14, 12,  5],
       [19,  5, 16],
       [19,  9, 18],
       [11,  7, 13]])

### <a id='toc1_1_'></a>[Transpose of an array](#toc0_)

`-->` If an n-dimensional array has a shape of <i><b>(i[0], i[1], ... i[n-2], i[n-1])</b></i>, its transpose will have the shape <i><b>(i[n-1], i[n-2], ... i[1], i[0])</b></i>.

- For a 1-D array this has no effect, as a transposed vector is simply the same vector. 
- For a 2-D array, this is a standard matrix transpose.


`-->` To find the transpose of an array we can use <i>`numpy.transpose(a, axes=None)`</i>, which returns a view of the array with its axes transposed.

- axes=None: reverses the order of the axes

- axes=tuple of ints: must be a permutation of (0, 1, ..., n-1); the i-th axis of the transposed array corresponds to axis `axes[i]` of 'a'

In [4]:
np.transpose(ary)  # or simply, ary.transpose()

array([[14, 19, 19, 11],
       [12,  5,  9,  7],
       [ 5, 16, 18, 13]])
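The `axes` argument is easier to see on an array with more than two dimensions. The following is a minimal sketch; the array `t` and its shape are made up for illustration and are not part of the example above.

```python
import numpy as np

# Hypothetical 3-D example array with shape (2, 3, 4)
t = np.arange(24).reshape(2, 3, 4)

# axes=None reverses the axis order: (2, 3, 4) -> (4, 3, 2)
rev = np.transpose(t)

# axes=(1, 0, 2): result axis 0 is input axis 1, result axis 1 is input axis 0
perm = np.transpose(t, axes=(1, 0, 2))

print(rev.shape)   # (4, 3, 2)
print(perm.shape)  # (3, 2, 4)

# The transpose is a view of the same data, so nothing is copied
print(np.shares_memory(t, perm))  # True
```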

### <a id='toc1_2_'></a>[Reshaping](#toc0_)

In [5]:
# np.reshape(a, newshape)
# Gives a new shape to the array "a" without changing its data.
# doesn't change in-place
np.reshape(ary, (2, 6))

array([[14, 12,  5, 19,  5, 16],
       [19,  9, 18, 11,  7, 13]])
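Two details of `np.reshape` are worth a quick check: one dimension may be given as `-1` so that NumPy infers it, and the new shape must hold exactly the same number of elements. A small sketch, using a made-up array `r`:

```python
import numpy as np

r = np.arange(12).reshape(4, 3)

# -1 asks NumPy to infer that dimension from the total size (12 / 2 = 6)
r2 = np.reshape(r, (2, -1))
print(r2.shape)  # (2, 6)

# The original array keeps its shape, since reshape is not in-place
print(r.shape)   # (4, 3)

# A shape whose product differs from the element count is rejected
reshape_failed = False
try:
    np.reshape(r, (5, 3))
except ValueError as err:
    reshape_failed = True
    print(err)
```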

### <a id='toc1_3_'></a>[Flattening](#toc0_)

In [6]:
# .flatten() will return a copy of the original array
ary.flatten()

array([14, 12,  5, 19,  5, 16, 19,  9, 18, 11,  7, 13])

In [7]:
# .ravel() returns a view of the original data whenever possible (a copy is made only if needed)
ary.ravel()

array([14, 12,  5, 19,  5, 16, 19,  9, 18, 11,  7, 13])
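Both calls print the same values, so the difference only shows up when the result is modified. The sketch below uses a small made-up array `m` and `np.shares_memory` to check whether the data is shared.

```python
import numpy as np

m = np.arange(6).reshape(2, 3)

flat_copy = m.flatten()  # always a copy
flat_view = m.ravel()    # a view here, because m is C-contiguous

flat_copy[0] = 100       # does not touch m
flat_view[1] = 200       # writes through to m

print(m[0, 0], m[0, 1])                # 0 200
print(np.shares_memory(m, flat_copy))  # False
print(np.shares_memory(m, flat_view))  # True

# On a non-contiguous array (here a transpose), ravel has to copy
ravel_of_transpose = m.T.ravel()
print(np.shares_memory(m, ravel_of_transpose))  # False
```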

### <a id='toc1_4_'></a>[Expanding](#toc0_)

In [8]:
# example array
a = np.arange(1, 10).reshape(3, 3)

In [9]:
a

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [10]:
# Expand the shape of an array by inserting a new axis that will appear at the `axis` position in the expanded array shape.
exp_a = np.expand_dims(a, axis=1)

In [11]:
exp_a

array([[[1, 2, 3]],

       [[4, 5, 6]],

       [[7, 8, 9]]])

In [12]:
exp_a.shape

(3, 1, 3)
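The position of the new axis follows the `axis` argument, and indexing with `np.newaxis` is an equivalent spelling. A brief sketch with a made-up array `g`:

```python
import numpy as np

g = np.arange(6).reshape(2, 3)

front = np.expand_dims(g, axis=0)   # new axis at the front
back = np.expand_dims(g, axis=-1)   # new axis at the end
middle = g[:, np.newaxis, :]        # same as np.expand_dims(g, axis=1)

print(front.shape)   # (1, 2, 3)
print(back.shape)    # (2, 3, 1)
print(middle.shape)  # (2, 1, 3)
print(np.array_equal(middle, np.expand_dims(g, axis=1)))  # True
```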

### <a id='toc1_5_'></a>[Squeezing](#toc0_)

In [13]:
# np.squeeze(a, axis=None)
# the opposite of expanding
# axis=None removes all the axes of length one from `a`
# axis=k removes only that particular axis, which must have a length of 1
np.squeeze(exp_a, axis=1)

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])
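To illustrate the length-1 requirement, here is a short sketch on a made-up array `s` with two length-1 axes. Asking to squeeze an axis of a different length raises a `ValueError`.

```python
import numpy as np

s = np.zeros((1, 3, 1))

all_squeezed = np.squeeze(s)          # drops both length-1 axes
one_squeezed = np.squeeze(s, axis=0)  # drops only axis 0

print(all_squeezed.shape)  # (3,)
print(one_squeezed.shape)  # (3, 1)

squeeze_failed = False
try:
    np.squeeze(s, axis=1)  # axis 1 has length 3, so it cannot be removed
except ValueError as err:
    squeeze_failed = True
    print(err)
```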

## <a id='toc2_'></a>[<u>Array repetition</u>](#toc0_)

In [14]:
# example array
rp_ary = np.linspace(1, 9, 9).reshape(3, 3)

In [15]:
rp_ary

array([[1., 2., 3.],
       [4., 5., 6.],
       [7., 8., 9.]])

### <a id='toc2_1_'></a>[Repeat an Array a given number of times](#toc0_)

In [16]:
# numpy.tile(A, reps)
# if we pass a tuple for reps, for example (4, 1), the array is repeated
# 4 times along axis 0 (rows) and only once along axis 1 (columns), i.e. no repetition there
np.tile(rp_ary, (2, 2))

array([[1., 2., 3., 1., 2., 3.],
       [4., 5., 6., 4., 5., 6.],
       [7., 8., 9., 7., 8., 9.],
       [1., 2., 3., 1., 2., 3.],
       [4., 5., 6., 4., 5., 6.],
       [7., 8., 9., 7., 8., 9.]])
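The `(4, 1)` case mentioned in the comment can be checked directly. The sketch below uses a small made-up array `base`, and also shows that an integer `reps` repeats a 2-D array along its last axis only.

```python
import numpy as np

base = np.array([[1, 2],
                 [3, 4]])

# 4 copies along axis 0 (rows), 1 copy along axis 1 (columns)
tall = np.tile(base, (4, 1))
print(tall.shape)  # (8, 2)

# An integer reps is treated as (1, 2) for a 2-D array
wide = np.tile(base, 2)
print(wide)
# [[1 2 1 2]
#  [3 4 3 4]]
```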

### <a id='toc2_2_'></a>[Repeat elements of an array](#toc0_)

In [17]:
# numpy.repeat(a, repeats, axis=None)
# axis=None will repeat the array elements in a flattened version of the array
np.repeat(rp_ary, 2, axis=0)

array([[1., 2., 3.],
       [1., 2., 3.],
       [4., 5., 6.],
       [4., 5., 6.],
       [7., 8., 9.],
       [7., 8., 9.]])
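For comparison, the sketch below shows the other common ways to call `np.repeat` on a made-up array `e`: with `axis=None`, along the columns, and with a different repeat count per row.

```python
import numpy as np

e = np.array([[1, 2],
              [3, 4]])

# axis=None: the array is flattened first, then each element is repeated
flat_rep = np.repeat(e, 2)
print(flat_rep)  # [1 1 2 2 3 3 4 4]

# axis=1: each column is repeated
col_rep = np.repeat(e, 2, axis=1)
print(col_rep)
# [[1 1 2 2]
#  [3 3 4 4]]

# A sequence of counts: row 0 once, row 1 three times
row_rep = np.repeat(e, [1, 3], axis=0)
print(row_rep.shape)  # (4, 2)
```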

## <a id='toc3_'></a>[<u>Broadcasting</u>](#toc0_)

Broadcasting is a mechanism that lets NumPy perform element-wise operations on arrays with different shapes by treating the smaller array as if it were stretched to the shape of the larger one.

Subject to certain constraints, the smaller array is “broadcast” across the larger array so that they have compatible shapes. Broadcasting provides a means of vectorizing array operations, which increases the speed of execution.

`Note:` Not all arrays can be broadcast. They must meet certain conditions.

When operating on two arrays, NumPy compares their shapes dimension by dimension. It starts with the trailing (i.e. rightmost) dimension and works its way left. Two dimensions are compatible when
- they are equal, or
- one of them is 1.

If these conditions are not met, a `ValueError: operands could not be broadcast together` exception is raised, indicating that the arrays have incompatible shapes.

A set of arrays is called “broadcastable” to the same shape if the above rules produce a valid result. For example, the following is a valid broadcasting operation. <img src=Broadcasting.png>
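The compatibility rules above can be written out as a small function. This is only an illustrative sketch: `broadcast_shape` is a hypothetical helper, not a NumPy function, and it is compared against `np.broadcast_shapes`, which is assumed to be available (NumPy 1.20 or later).

```python
import numpy as np


def broadcast_shape(shape_a, shape_b):
    """Hypothetical helper: apply the two compatibility rules to a pair of shapes."""
    # Pad the shorter shape with leading 1s so the trailing dimensions line up
    ndim = max(len(shape_a), len(shape_b))
    padded_a = (1,) * (ndim - len(shape_a)) + tuple(shape_a)
    padded_b = (1,) * (ndim - len(shape_b)) + tuple(shape_b)

    result = []
    for da, db in zip(padded_a, padded_b):
        if da == db or da == 1 or db == 1:
            # A dimension of length 1 is stretched to match the other one
            result.append(db if da == 1 else da)
        else:
            raise ValueError(f"incompatible dimensions {da} and {db}")
    return tuple(result)


print(broadcast_shape((4, 3), (3,)))    # (4, 3)
print(broadcast_shape((3, 1), (1, 4)))  # (3, 4)

# Same answer as NumPy's own helper
print(np.broadcast_shapes((3, 1), (1, 4)))  # (3, 4)

rule_failed = False
try:
    broadcast_shape((2, 2), (3, 1))  # 2 vs 3 in the first dimension
except ValueError as err:
    rule_failed = True
    print(err)
```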

In [18]:
a = np.arange(1, 5).reshape(2, 2)
b = np.arange(6, 9).reshape(3, 1)
c = 5

- **Example of Valid broadcasting**

In [19]:
a + c

array([[6, 7],
       [8, 9]])

- **Example of Invalid broadcasting**

In [20]:
try:
    a + b
except ValueError as err_msg:
    print(err_msg)

operands could not be broadcast together with shapes (2,2) (3,1) 


**Note:** Operations such as repeating or tiling can be used to make two arrays compatible for element-wise operations, but this creates full-size copies of the data, so it is not recommended unless absolutely necessary.
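As a quick comparison, the sketch below adds a row vector to every row of a matrix, once through broadcasting and once through an explicit `np.tile`. The arrays `grid` and `row` are made up for illustration.

```python
import numpy as np

grid = np.arange(6).reshape(2, 3)
row = np.array([10, 20, 30])

# Broadcasting: shapes (2, 3) and (3,) are compatible, no copy of `row` is built
via_broadcast = grid + row

# Tiling: `row` is first materialized as a full (2, 3) array
tiled_row = np.tile(row, (2, 1))
via_tile = grid + tiled_row

print(tiled_row.shape)                          # (2, 3)
print(np.array_equal(via_broadcast, via_tile))  # True
```

Both give the same result; the tiled version just allocates an extra intermediate array.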

## <a id='toc4_'></a>[<u>Joining Arrays</u>](#toc0_)

### <a id='toc4_1_'></a>[Concatenation](#toc0_)

`np.concatenate((a1, a2, ...), axis=0, out=None, dtype=None, casting="same_kind")`

In [21]:
# example arrays
con_ary1 = np.linspace(1, 9, 9).reshape(3, 3)
con_ary2 = np.linspace(10, 12, 3).reshape(1, 3)

- Row-wise concatenation (to concatenate along axis=0, the size along axis=1 must be equal in all of the arrays to be concatenated)

In [22]:
np.concatenate((con_ary1, con_ary2), axis=0)

array([[ 1.,  2.,  3.],
       [ 4.,  5.,  6.],
       [ 7.,  8.,  9.],
       [10., 11., 12.]])

- Column-wise concatenation (to concatenate along axis=1, the size along axis=0 must be equal in all of the arrays to be concatenated)

In [23]:
np.concatenate((con_ary1, con_ary2.T), axis=1)

array([[ 1.,  2.,  3., 10.],
       [ 4.,  5.,  6., 11.],
       [ 7.,  8.,  9., 12.]])
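The shape requirement can be seen by trying the column-wise case without the transpose. The sketch below uses made-up arrays `x` and `y`, and also shows `axis=None`, which flattens the inputs before joining them.

```python
import numpy as np

x = np.ones((3, 3))
y = np.zeros((1, 3))

# Works: both arrays have 3 columns
rows_joined = np.concatenate((x, y), axis=0)
print(rows_joined.shape)  # (4, 3)

# Fails: x has 3 rows but y has only 1
concat_failed = False
try:
    np.concatenate((x, y), axis=1)
except ValueError as err:
    concat_failed = True
    print(err)

# axis=None flattens every input first
flat_joined = np.concatenate((x, y), axis=None)
print(flat_joined.shape)  # (12,)
```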

### <a id='toc4_2_'></a>[Stacking](#toc0_)

To stack arrays together, all the arrays must have the same shape. Note that stacking adds a new dimension to the result.

We use the `numpy.stack((tuple of arrays), axis=0)` function to perform the stacking operation. The axis parameter specifies the index of the new axis in the dimensions of the result.

In [24]:
stk_ary1 = np.linspace(1, 9, 9).reshape(3, 3)
stk_ary2 = np.linspace(10, 18, 9).reshape(3, 3)

In [25]:
np.stack((stk_ary1, stk_ary2), axis=1)

array([[[ 1.,  2.,  3.],
        [10., 11., 12.]],

       [[ 4.,  5.,  6.],
        [13., 14., 15.]],

       [[ 7.,  8.,  9.],
        [16., 17., 18.]]])

In [26]:
np.stack((stk_ary1, stk_ary2), axis=1).shape

(3, 2, 3)
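To see how `axis` controls where the new dimension lands, the following sketch stacks two made-up 3x3 arrays along each possible position and records the resulting shapes.

```python
import numpy as np

p = np.zeros((3, 3))
q = np.ones((3, 3))

# The new axis has length 2 (the number of stacked arrays)
shapes = {axis: np.stack((p, q), axis=axis).shape for axis in (0, 1, 2, -1)}

for axis, shape in shapes.items():
    print(axis, shape)
# 0 (2, 3, 3)
# 1 (3, 2, 3)
# 2 (3, 3, 2)
# -1 (3, 3, 2)
```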

### <a id='toc4_3_'></a>[Horizontal (column-wise) stacking](#toc0_)

This works differently from `np.stack()`: no new dimension is added.

`np.hstack()` joins the arrays column-wise, i.e., for arrays with two or more dimensions it is similar to **concatenating along axis=1**.

In [27]:
# example arrays
hstk_ary1 = np.linspace(1, 9, 9).reshape(3, 3)
hstk_ary2 = np.linspace(10, 12, 3).reshape(1, 3)

In [28]:
np.hstack((hstk_ary1, hstk_ary2.T))

array([[ 1.,  2.,  3., 10.],
       [ 4.,  5.,  6., 11.],
       [ 7.,  8.,  9., 12.]])
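One case to keep in mind is 1-D input, where there is no axis 1 and `np.hstack` joins along the only axis instead. A short sketch with made-up arrays:

```python
import numpy as np

u = np.array([1, 2, 3])
v = np.array([4, 5, 6])

# 1-D inputs are joined end to end
joined_1d = np.hstack((u, v))
print(joined_1d)  # [1 2 3 4 5 6]

# For 2-D inputs the result matches concatenation along axis=1
m1 = np.ones((2, 2))
m2 = np.zeros((2, 1))
joined_2d = np.hstack((m1, m2))
print(joined_2d.shape)  # (2, 3)
print(np.array_equal(joined_2d, np.concatenate((m1, m2), axis=1)))  # True
```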

### <a id='toc4_4_'></a>[Vertical (row-wise) stacking](#toc0_)

`np.vstack()` joins the arrays row-wise, i.e., it is similar to concatenating along axis=0.

In [29]:
# example arrays
vstk_ary1 = np.linspace(1, 9, 9).reshape(3, 3)
vstk_ary2 = np.linspace(10, 12, 3).reshape(1, 3)

In [30]:
np.vstack((vstk_ary1, vstk_ary2))

array([[ 1.,  2.,  3.],
       [ 4.,  5.,  6.],
       [ 7.,  8.,  9.],
       [10., 11., 12.]])
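With 1-D input, `np.vstack` treats each array as a single row, so the result is 2-D. A minimal sketch with made-up arrays:

```python
import numpy as np

u = np.array([1, 2, 3])
v = np.array([4, 5, 6])

# Each 1-D array of length 3 is treated as a (1, 3) row
rows = np.vstack((u, v))
print(rows)
# [[1 2 3]
#  [4 5 6]]
print(rows.shape)  # (2, 3)
```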

## <a id='toc5_'></a>[<u>Splitting Arrays</u>](#toc0_)

In [31]:
# example array
sp_ary = np.linspace(0, 30, 12).reshape(4, 3)

### <a id='toc5_1_'></a>[Splitting an array into multiple sub-arrays of equal length](#toc0_)

- **numpy.split(ary, indices_or_sections, axis=0)**

If an integer number of sections is given and the size of the array along the specified axis is not divisible by it, a `ValueError` is raised.

In [32]:
np.split(sp_ary, 2)

[array([[ 0.        ,  2.72727273,  5.45454545],
        [ 8.18181818, 10.90909091, 13.63636364]]),
 array([[16.36363636, 19.09090909, 21.81818182],
        [24.54545455, 27.27272727, 30.        ]])]
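The sketch below shows the failing case on a made-up 4x3 array `sp`, together with the alternative of passing a list of indices, which splits at those positions instead of into equal parts.

```python
import numpy as np

sp = np.arange(12).reshape(4, 3)

# 4 rows cannot be divided into 3 equal sections
split_failed = False
try:
    np.split(sp, 3)
except ValueError as err:
    split_failed = True
    print(err)

# A list of indices splits before rows 1 and 3: rows [0:1], [1:3], [3:]
parts = np.split(sp, [1, 3])
part_shapes = [part.shape for part in parts]
print(part_shapes)  # [(1, 3), (2, 3), (1, 3)]
```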

- **numpy.array_split(ary, indices_or_sections, axis=0)**

    - Unlike `numpy.split`, this does not raise an error when the array cannot be divided equally. For an axis of length l split into n sections, the first l % n sub-arrays have size l//n + 1 and the rest have size l//n, so no elements are discarded.
    - If there are more sections than elements along the given axis, the extra sub-arrays are empty along that axis.

In [33]:
np.array_split(sp_ary, 4, axis=1)

[array([[ 0.        ],
        [ 8.18181818],
        [16.36363636],
        [24.54545455]]),
 array([[ 2.72727273],
        [10.90909091],
        [19.09090909],
        [27.27272727]]),
 array([[ 5.45454545],
        [13.63636364],
        [21.81818182],
        [30.        ]]),
 array([], shape=(4, 0), dtype=float64)]
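An uneven split on a made-up 1-D array shows how the elements are distributed: the leading sub-arrays are one element longer, and joining the pieces gives back the original array.

```python
import numpy as np

seq = np.arange(10)

# 10 elements into 3 sections: 10 % 3 = 1 sub-array of size 4, the rest of size 3
chunks = np.array_split(seq, 3)
chunk_sizes = [len(chunk) for chunk in chunks]
print(chunk_sizes)  # [4, 3, 3]

# Nothing is lost: the pieces reassemble into the original array
print(np.array_equal(np.concatenate(chunks), seq))  # True
```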

### <a id='toc5_2_'></a>[Horizontal (column-wise) splitting](#toc0_)

**Note:** When an integer is passed, the number of columns must be divisible by the number of sections.

In [34]:
# numpy.hsplit(ary, indices_or_sections)
np.hsplit(sp_ary, 3)

[array([[ 0.        ],
        [ 8.18181818],
        [16.36363636],
        [24.54545455]]),
 array([[ 2.72727273],
        [10.90909091],
        [19.09090909],
        [27.27272727]]),
 array([[ 5.45454545],
        [13.63636364],
        [21.81818182],
        [30.        ]])]
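A short sketch on a made-up 4x6 array `wide_ary` shows that `np.hsplit` accepts any section count that divides the number of columns, as well as a list of column indices.

```python
import numpy as np

wide_ary = np.arange(24).reshape(4, 6)

# 6 columns into 2 sections of 3 columns each
halves = [part.shape for part in np.hsplit(wide_ary, 2)]
print(halves)  # [(4, 3), (4, 3)]

# Split before columns 1 and 4: columns [0:1], [1:4], [4:]
by_index = [part.shape for part in np.hsplit(wide_ary, [1, 4])]
print(by_index)  # [(4, 1), (4, 3), (4, 2)]
```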

### <a id='toc5_3_'></a>[Vertical (row-wise) splitting](#toc0_)

**Note:** When an integer is passed, the number of rows must be divisible by the number of sections.

In [35]:
# numpy.vsplit(ary, indices_or_sections)
np.vsplit(sp_ary, 4)

[array([[0.        , 2.72727273, 5.45454545]]),
 array([[ 8.18181818, 10.90909091, 13.63636364]]),
 array([[16.36363636, 19.09090909, 21.81818182]]),
 array([[24.54545455, 27.27272727, 30.        ]])]
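The same applies row-wise. In the sketch below, a made-up 4x3 array `tall_ary` is split into 2 sections of 2 rows each, while asking for 3 sections raises a `ValueError`.

```python
import numpy as np

tall_ary = np.arange(12).reshape(4, 3)

# 4 rows into 2 sections of 2 rows each
row_halves = [part.shape for part in np.vsplit(tall_ary, 2)]
print(row_halves)  # [(2, 3), (2, 3)]

# 4 rows cannot be divided into 3 equal sections
vsplit_failed = False
try:
    np.vsplit(tall_ary, 3)
except ValueError as err:
    vsplit_failed = True
    print(err)
```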

In [2]:
import numpy as np

In [3]:
np.info(np.insert)

 insert(arr, obj, values, axis=None)

Insert values along the given axis before the given indices.

Parameters
----------
arr : array_like
    Input array.
obj : int, slice or sequence of ints
    Object that defines the index or indices before which `values` is
    inserted.

    .. versionadded:: 1.8.0

    Support for multiple insertions when `obj` is a single scalar or a
    sequence with one element (similar to calling insert multiple
    times).
values : array_like
    Values to insert into `arr`. If the type of `values` is different
    from that of `arr`, `values` is converted to the type of `arr`.
    `values` should be shaped so that ``arr[...,obj,...] = values``
    is legal.
axis : int, optional
    Axis along which to insert `values`.  If `axis` is None then `arr`
    is flattened first.

Returns
-------
out : ndarray
    A copy of `arr` with `values` inserted.  Note that `insert`
    does not occur in-place: a new array is returned. If
    `axis` is None, `out` is a flatt