
*Continuing the NumPy series, we are going to learn about shape manipulation.*
*So, let's begin!*

### ***Shape Manipulation***
An array has a shape given by the number of elements along each axis:

In [4]:
import numpy as np
rg = np.random.default_rng(1)  # create instance of default random number generator
a = np.floor(10 * rg.random((3, 4)))
print(a)
print()
print(a.shape)

[[5. 9. 1. 9.]
 [3. 4. 8. 4.]
 [5. 0. 7. 5.]]

(3, 4)


*The shape of an array can be changed with various commands. Note that the following three commands all return a modified array, but do not change the original array:*

In [5]:
print(a.ravel())  # returns the array, flattened
print()
print(a.reshape(6, 2)) # returns the array with a modified shape
print()
print(a.T) # returns the array, transposed
print("Transpose shape: ", a.T.shape)
print("Original shape: ", a.shape)

[5. 9. 1. 9. 3. 4. 8. 4. 5. 0. 7. 5.]

[[5. 9.]
 [1. 9.]
 [3. 4.]
 [8. 4.]
 [5. 0.]
 [7. 5.]]

[[5. 3. 5.]
 [9. 4. 0.]
 [1. 8. 7.]
 [9. 4. 5.]]
Transpose shape:  (4, 3)
Original shape:  (3, 4)


*The order of the elements in the array resulting from ravel is normally “C-style”, that is, the rightmost index “changes the fastest”, so the element after a[0, 0] is a[0, 1]. If the array is reshaped to some other shape, again the array is treated as “C-style”. NumPy normally creates arrays stored in this order, so ravel will usually not need to copy its argument, but if the array was made by taking slices of another array or created with unusual options, it may need to be copied. The functions ravel and reshape can also be instructed, using an optional argument, to use FORTRAN-style arrays, in which the leftmost index changes the fastest.*
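*As a small sketch of the two orderings, the example below flattens an illustrative array `m` (not part of the notebook's running example) in both C and Fortran order, and uses `np.shares_memory` to check when `ravel` can avoid a copy:*

```python
import numpy as np

m = np.arange(6).reshape(2, 3)   # [[0 1 2], [3 4 5]]

c_flat = m.ravel()               # C-style: rightmost index changes fastest
f_flat = m.ravel(order="F")      # Fortran-style: leftmost index changes fastest
print(c_flat)                    # [0 1 2 3 4 5]
print(f_flat)                    # [0 3 1 4 2 5]

# ravel of a C-contiguous array is a view; ravel of its transpose needs a copy
print(np.shares_memory(m, m.ravel()))      # True
print(np.shares_memory(m, m.T.ravel()))    # False
```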

*The reshape function returns its argument with a modified shape, whereas the ndarray.resize method modifies the array itself:*

In [6]:
print(a)
a.resize((2, 6))
print(a)

[[5. 9. 1. 9.]
 [3. 4. 8. 4.]
 [5. 0. 7. 5.]]
[[5. 9. 1. 9. 3. 4.]
 [8. 4. 5. 0. 7. 5.]]


*If a dimension is given as -1 in a reshaping operation, the other dimensions are automatically calculated:*

In [7]:
a.reshape(3, -1)

array([[5., 9., 1., 9.],
       [3., 4., 8., 4.],
       [5., 0., 7., 5.]])
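
*To illustrate how the -1 placeholder is resolved, here is a short sketch on a hypothetical 12-element array `x`. The missing dimension is the total size divided by the product of the known dimensions, so the known dimensions must divide the size evenly:*

```python
import numpy as np

x = np.arange(12)

print(x.reshape(3, -1).shape)     # (3, 4): 12 / 3 = 4
print(x.reshape(-1, 6).shape)     # (2, 6)
print(x.reshape(2, -1, 3).shape)  # (2, 2, 3)

# 12 is not divisible by 5, so NumPy cannot infer the missing dimension
try:
    x.reshape(5, -1)
except ValueError as err:
    print("ValueError:", err)
```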

### ***Stacking together different arrays***
*Several arrays can be stacked together along different axes:*

In [9]:
a = np.floor(10 * rg.random((2, 2)))
print(a)
print()
b = np.floor(10 * rg.random((2, 2)))
print(b)
print()
print(np.vstack((a, b))) #vertical stacking
print()
print(np.hstack((a, b))) #horizontal stacking

[[7. 2.]
 [4. 9.]]

[[9. 7.]
 [5. 2.]]

[[7. 2.]
 [4. 9.]
 [9. 7.]
 [5. 2.]]

[[7. 2. 9. 7.]
 [4. 9. 5. 2.]]


*The function column_stack stacks 1D arrays as columns into a 2D array. It is equivalent to hstack only for 2D arrays:*

In [10]:
from numpy import newaxis
print(np.column_stack((a, b)))  # with 2D arrays
print()
a = np.array([4., 2.])
b = np.array([3., 8.])
print(np.column_stack((a, b))) # returns a 2D array
print()
print(np.hstack((a, b)))       # the result is different
print()
print(a[:, newaxis])           # view `a` as a 2D column vector
print()
print(np.column_stack((a[:, newaxis], b[:, newaxis])))
print()
print(np.hstack((a[:, newaxis], b[:, newaxis]))) # the result is the same

[[7. 2. 9. 7.]
 [4. 9. 5. 2.]]

[[4. 3.]
 [2. 8.]]

[4. 2. 3. 8.]

[[4.]
 [2.]]

[[4. 3.]
 [2. 8.]]

[[4. 3.]
 [2. 8.]]


*On the other hand, the function row_stack is equivalent to vstack for any input arrays. In fact, in NumPy 1.x row_stack is simply an alias for vstack (the alias is deprecated as of NumPy 2.0, so prefer vstack in new code):*

In [11]:
np.column_stack is np.hstack

False

In [12]:
np.row_stack is np.vstack

True

*In general, for arrays with more than two dimensions, hstack stacks along their second axes, vstack stacks along their first axes, and concatenate accepts an optional argument giving the number of the axis along which the concatenation should happen.*
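
*A minimal sketch of this rule on two illustrative 3D arrays `p` and `q` (names chosen here for the example); only the resulting shapes are printed:*

```python
import numpy as np

p = np.zeros((2, 3, 4))
q = np.ones((2, 3, 4))

print(np.vstack((p, q)).shape)               # (4, 3, 4): joined along the first axis
print(np.hstack((p, q)).shape)               # (2, 6, 4): joined along the second axis
print(np.concatenate((p, q), axis=2).shape)  # (2, 3, 8): joined along the chosen axis
```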

***Note:*** *In complex cases, r_ and c_ are useful for creating arrays by stacking numbers along one axis. They allow the use of range literals such as 1:4:*

In [18]:
print(np.r_[1:4, 0, 4])
print()
print(np.c_[np.array([1,2,3]), np.array([4,5,6])])

[1 2 3 0 4]

[[1 4]
 [2 5]
 [3 6]]


*When used with arrays as arguments, r_ and c_ are similar to vstack and hstack in their default behavior, but allow for an optional argument giving the number of the axis along which to concatenate.*
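
*The following sketch checks that claim on two small illustrative 2D arrays `u` and `v`. For r_, the axis is passed as a leading string integer inside the brackets:*

```python
import numpy as np

u = np.array([[1, 2], [3, 4]])
v = np.array([[5, 6], [7, 8]])

# Default behavior on 2D arrays matches vstack and hstack
print(np.array_equal(np.r_[u, v], np.vstack((u, v))))   # True
print(np.array_equal(np.c_[u, v], np.hstack((u, v))))   # True

# A leading string integer selects the concatenation axis for r_
side_by_side = np.r_["1", u, v]
print(side_by_side)
```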

### ***Splitting one array into several smaller ones***
*Using hsplit, you can split an array along its horizontal axis, either by specifying the number of equally shaped arrays to return, or by specifying the columns after which the division should occur:*

In [20]:
a = np.floor(10 * rg.random((2, 12)))
print(a)
print()
print(np.hsplit(a, 3)) # Split `a` into 3
print()
print(np.hsplit(a, (3, 4))) # Split `a` after the third and the fourth column

[[1. 8. 1. 0. 8. 8. 8. 4. 2. 0. 6. 7.]
 [8. 2. 2. 6. 8. 9. 1. 4. 8. 4. 5. 0.]]

[array([[1., 8., 1., 0.],
       [8., 2., 2., 6.]]), array([[8., 8., 8., 4.],
       [8., 9., 1., 4.]]), array([[2., 0., 6., 7.],
       [8., 4., 5., 0.]])]

[array([[1., 8., 1.],
       [8., 2., 2.]]), array([[0.],
       [6.]]), array([[8., 8., 8., 4., 2., 0., 6., 7.],
       [8., 9., 1., 4., 8., 4., 5., 0.]])]


*vsplit splits along the vertical axis, and array_split allows one to specify along which axis to split.*
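
*As a brief sketch of both functions, using an illustrative array `g`: vsplit cuts along the first axis, while array_split takes an explicit axis and, unlike the split functions, also accepts a section count that does not divide the axis evenly:*

```python
import numpy as np

g = np.arange(24).reshape(4, 6)

top, bottom = np.vsplit(g, 2)             # two (2, 6) blocks split along axis 0
print(top.shape, bottom.shape)

# 6 columns into 4 parts: the first parts get the extra columns
parts = np.array_split(g, 4, axis=1)
print([part.shape for part in parts])     # [(4, 2), (4, 2), (4, 1), (4, 1)]
```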

## ***Copies and Views***
*When operating on and manipulating arrays, their data is sometimes copied into a new array and sometimes not. This is often a source of confusion for beginners. There are three cases:*

### ***No Copy at All***
*Simple assignments make no copy of objects or their data.*

In [21]:
a = np.array([[ 0,  1,  2,  3], [ 4,  5,  6,  7], [ 8,  9, 10, 11]])
b = a
print(b is a)

True


*Python passes mutable objects as references, so function calls make no copy.*

In [22]:
def f(x):
    print(id(x))

print(id(a))  # id is a unique identifier of an object
f(a)

140303486961504
140303486961504
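
*One practical consequence, sketched below with a hypothetical helper `zero_first_row`: because the function receives the same array object, an in-place modification inside the function is visible to the caller.*

```python
import numpy as np

def zero_first_row(x):
    # x is the same object as the caller's array, so this edits it in place
    x[0, :] = 0

arr = np.arange(6).reshape(2, 3)
zero_first_row(arr)
print(arr)   # the first row of arr is now all zeros
```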


### ***View or Shallow Copy***
*Different array objects can share the same data. The view method creates a new array object that looks at the same data.*

In [23]:
c = a.view()
print(c)
print()
print(c is a)
print()
print(c.base is a)
print()
print(c.flags.owndata)
print()
c = c.reshape((2, 6))
print(c.shape)
print()
print(a.shape)
c[0, 4] = 1234
print(a)

[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]

False

True

False

(2, 6)

(3, 4)
[[   0    1    2    3]
 [1234    5    6    7]
 [   8    9   10   11]]


*Slicing an array returns a view of it:*

In [24]:
s = a[:, 1:3]
print(s)
print()
s[:] = 10  # s[:] is a view of s. Note the difference between s = 10 and s[:] = 10
print(a)

[[ 1  2]
 [ 5  6]
 [ 9 10]]

[[   0   10   10    3]
 [1234   10   10    7]
 [   8   10   10   11]]
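
*The difference between `s = 10` and `s[:] = 10` can be sketched on a fresh illustrative array `base`: the first form only rebinds the name, while the second writes through the view into the shared data.*

```python
import numpy as np

base = np.arange(6).reshape(2, 3)

s = base[:, 1:3]   # s is a view into base
s = 10             # rebinds the name s to the int 10; base is untouched
print(np.array_equal(base, np.arange(6).reshape(2, 3)))  # True

t = base[:, 1:3]
t[:] = 10          # assigns into the view, which modifies base's data
print(base)
```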


### ***Deep Copy***
*The copy method makes a complete copy of the array and its data.*

In [25]:
d = a.copy()  # a new array object with new data is created
print(d is a)
print()
print(d.base is a) # d doesn't share anything with a
print()
d[0, 0] = 9999
print(a)

False

False

[[   0   10   10    3]
 [1234   10   10    7]
 [   8   10   10   11]]


*Sometimes copy should be called after slicing if the original array is no longer needed. For example, suppose a is a huge intermediate result and the final result b contains only a small fraction of a. A deep copy should then be made when constructing b with slicing, because a plain slice b = a[:100] would keep a reference to a, and its memory would persist even after del a:*

In [26]:
a = np.arange(int(1e8))
b = a[:100].copy()
del a  # the memory of ``a`` can be released.
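
*The reason can be checked through the base attribute. In this sketch (with a smaller illustrative array `big`), the plain slice still refers to the original array, while the copy owns its own data:*

```python
import numpy as np

big = np.arange(1_000_000)

view = big[:100]           # a view: keeps big alive through its .base reference
small = big[:100].copy()   # an independent array that owns its data

print(view.base is big)    # True
print(small.base is None)  # True
```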

***Advanced indexing will be covered in the next notebook!***
***Stay tuned!***