## Reshaping NumPy Array Numerical Columns/Features before Normalization

##### The main principle I am going to discuss, with an example below, is this:

### To normalize a single numerical feature with `Normalizer()`, we have to use `reshape(1, -1)` and NOT `reshape(-1, 1)` before applying the transform. Of course, after the normalization is done we can reshape it back with `(-1, 1)` if that's what we need.

Here is how `Normalizer` works and why you should use `reshape(1, -1)` instead of `(-1, 1)`.

### `Normalizer` by default normalizes each sample (row), while `StandardScaler` standardizes each feature (column)
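
To make the row-versus-column distinction concrete, here is a minimal sketch. The 2x2 matrix `X` is a made-up example that is not part of the original walkthrough; both scalers are used with their default settings.

```python
import numpy as np
from sklearn.preprocessing import Normalizer, StandardScaler

# Hypothetical data: 2 samples (rows) x 2 features (columns).
X = np.array([[3.0, 4.0],
              [6.0, 8.0]])

# Normalizer rescales each ROW to unit L2 norm.
row_normalized = Normalizer().fit_transform(X)

# StandardScaler rescales each COLUMN to zero mean and unit variance.
col_standardized = StandardScaler().fit_transform(X)

print("Normalizer (per row):\n", row_normalized)
print("L2 norm of each row:", np.linalg.norm(row_normalized, axis=1))
print("StandardScaler (per column):\n", col_standardized)
print("mean of each column:", col_standardized.mean(axis=0))
```

Both rows point in the same direction, so `Normalizer` maps them to the same unit vector, while `StandardScaler` only looks at how values vary down each column.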

---

### What does -1 mean in NumPy `reshape()`?

The "-1" stands for "unknown dimension", which can and should be inferred from the array's length and the other dimensions.

#### Main rule: when we don't know how many columns the resulting matrix should have, set that dimension to -1; and if we know we want a single row, set the first parameter to 1.
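
The rule above can be sketched with a small array. The six-element array `a` and the variable names are illustrative assumptions, chosen only so that several shapes are possible.

```python
import numpy as np

a = np.arange(6)  # six elements: [0 1 2 3 4 5]

# One dimension is given, the other is inferred so that rows * columns == 6.
two_rows = a.reshape(2, -1)    # NumPy infers 3 columns
three_cols = a.reshape(-1, 3)  # NumPy infers 2 rows
one_row = a.reshape(1, -1)     # NumPy infers 6 columns
one_col = a.reshape(-1, 1)     # NumPy infers 6 rows

for name, arr in [("(2, -1)", two_rows), ("(-1, 3)", three_cols),
                  ("(1, -1)", one_row), ("(-1, 1)", one_col)]:
    print(name, "->", arr.shape)

# Shapes NumPy cannot resolve raise ValueError:
# two unknowns, or a known dimension that does not divide the array size.
failed = []
for bad_shape in [(-1, -1), (4, -1)]:
    try:
        a.reshape(bad_shape)
    except ValueError:
        failed.append(bad_shape)
print("rejected shapes:", failed)
```

Note that `reshape(1, -1)` still returns a 2-dimensional array; it just has a single row.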

---

Let's take an example where my array is `np.array([1, 2, 3])`.

### Note the mechanism of `reshape()`: NumPy allows us to give one of the new shape parameters as -1 (e.g. `(2, -1)` or `(-1, 3)`, but not `(-1, -1)`). It simply means that the dimension is unknown and we want NumPy to figure it out. NumPy does this by looking at the length of the array and the remaining dimensions, making sure the new shape holds exactly the same number of elements as the original.

Here, if I use `(-1, 1)`, it means any number of rows, which NumPy is responsible for figuring out, while I specify that I need one column.

Remember, -1 lets NumPy calculate the unknown dimension so that the result matches the size of the original array.

And vice versa, `reshape(1, -1)` means that I am specifying one row while leaving the number of columns to be calculated by NumPy.

So, for the case where I use `(-1, 1)`: the number of rows is unknown, while the number of columns required is 1.

![Imgur](https://imgur.com/vv8tAQK.png)

`Normalizer()` goes sample by sample, i.e. row by row, and applies the normalization formula to each row on its own.
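
As a rough sketch of that per-row formula: with the default `norm="l2"`, each row is divided by its own Euclidean length. The row `[1, 2, 2]` below is a hypothetical example picked because its length is exactly 3; the manual computation is only an illustration, not scikit-learn's internal code.

```python
import numpy as np
from sklearn.preprocessing import Normalizer

# Hypothetical single sample with three features.
row = np.array([[1.0, 2.0, 2.0]])

# L2 norm of the row: sqrt(1^2 + 2^2 + 2^2) = 3
l2_norm = np.sqrt((row ** 2).sum(axis=1, keepdims=True))

# Divide every value in the row by that row's norm.
manual = row / l2_norm

# The same row passed through scikit-learn's Normalizer (default norm="l2").
from_sklearn = Normalizer().fit_transform(row)

print("manual:      ", manual)
print("Normalizer():", from_sklearn)
print("match:", np.allclose(manual, from_sklearn))
```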


Let's do an actual example.

In [6]:
from sklearn.preprocessing import Normalizer
import numpy as np

a = np.array([1, 2, 3])

b = np.reshape(a, (-1, 1))

c = np.reshape(a, (1, -1))

print('After reshaping with (-1, 1) \n', b)
print('After reshaping with (1, -1) \n', c)


After reshaping with (-1, 1) 
 [[1]
 [2]
 [3]]
After reshaping with (1, -1) 
 [[1 2 3]]


In [7]:
normalizer = Normalizer()

a_transformed_not_useful = normalizer.transform(b)
c_transformed = normalizer.transform(c)

print("a_transformed_not_useful \n \n", a_transformed_not_useful)
print("c_transformed \n \n", c_transformed)

a_transformed_not_useful 
 
 [[1.]
 [1.]
 [1.]]
c_transformed 
 
 [[0.26726124 0.53452248 0.80178373]]


As we can see, `a_transformed_not_useful`, the result of normalizing the `(-1, 1)` reshape, is a column vector in which every value is 1, which in most cases is not useful for our modelling. That is because `Normalizer()` goes sample by sample, i.e. row by row, and each row here holds a single value, so dividing it by its own norm gives 1 for every positive value.
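
Putting the principle together, here is a minimal sketch of the full round trip: reshape the feature to one row, normalize it, then reshape back to a column. The variable names (`feature`, `as_row`, `normalized_col`) are illustrative and not taken from the cells above.

```python
import numpy as np
from sklearn.preprocessing import Normalizer

feature = np.array([1, 2, 3])

# Treat the whole feature as one sample so it is normalized as a unit.
as_row = feature.reshape(1, -1)                  # shape (1, 3)
normalized_row = Normalizer().fit_transform(as_row)

# Reshape back to a column if downstream code expects shape (n_samples, 1).
normalized_col = normalized_row.reshape(-1, 1)   # shape (3, 1)

print(normalized_col)
print("shape:", normalized_col.shape)
```

The values are the same ones seen in `c_transformed`, only laid out as a column.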
