# NORMALIZATION PROBLEM

##### Normalization is one of the most basic preprocessing techniques in data analytics. It consists of centering and scaling: centering subtracts the mean from the data, and scaling divides the result by the standard deviation. Mathematically, normalization can be expressed as Z = (X - x̄)/sd, where x̄ is the mean of X and sd is its standard deviation. In NumPy, the mean and standard deviation over all elements of an ndarray can be obtained with the .mean() and .std() methods. In this problem, create a random 5 x 5 ndarray and store it in the variable X. Normalize X. Save your normalized ndarray as X_normalized.npy

##### Why do we use normalization? 
##### - Normalized data can improve model accuracy and performance. It helps algorithms that rely on distance measures, such as support vector machines, by keeping features with larger scales from dominating the learning process. It also puts features measured in different units on a common scale, so they can be compared directly.
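
To make the point about distance measures concrete, here is a minimal sketch. The `samples` array and the `euclidean` helper are hypothetical and exist only for illustration. The sketch normalizes each feature (column) separately with `axis=0`, which is one common convention and differs from the whole-array normalization this problem asks for.

```python
import numpy as np

# Hypothetical data: column 0 is in the thousands, column 1 lies between 0 and 1.
samples = np.array([
    [1000.0, 0.1],
    [1010.0, 0.9],
    [2000.0, 0.2],
])

def euclidean(a, b):
    """Euclidean distance between two 1-D arrays (illustrative helper)."""
    return np.sqrt(np.sum((a - b) ** 2))

# On the raw data, the large-scale feature decides almost the whole distance.
raw_01 = euclidean(samples[0], samples[1])
raw_02 = euclidean(samples[0], samples[2])
print("Raw distances:   ", raw_01, raw_02)

# Normalize each feature (column) to mean 0 and standard deviation 1.
scaled = (samples - samples.mean(axis=0)) / samples.std(axis=0)

# After scaling, both features contribute on a comparable footing.
scaled_01 = euclidean(scaled[0], scaled[1])
scaled_02 = euclidean(scaled[0], scaled[2])
print("Scaled distances:", scaled_01, scaled_02)
```

On the raw data the second distance is far larger than the first purely because of the scale of column 0. After normalization the two distances are of similar magnitude, so the small-scale feature is no longer ignored.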


In [30]:
# Import NumPy, which provides support for multidimensional arrays (ndarray)
import numpy as np

# Create a random 5x5 ndarray, store it in X, and display it
X = np.random.rand(5,5)
print("Random 5x5 ndarray: ")
print(X)

# Calculate the mean and standard deviation over all elements of X
mean_of_X = X.mean()
std_of_X = X.std()

# Normalize X using the given formula and display the result
X_normalized = (X - mean_of_X) / std_of_X
print("\nNormalized X: ")
print(X_normalized)

# Save the normalized ndarray
np.save('X_normalized.npy', X_normalized)

Random 5x5 ndarray: 
[[0.59710917 0.30588285 0.32391373 0.49047258 0.83150912]
 [0.1229211  0.99224615 0.27504258 0.53338411 0.13024868]
 [0.50384816 0.13358147 0.21626588 0.72043821 0.59957076]
 [0.65373965 0.73555749 0.42341109 0.56515684 0.24705054]
 [0.82214312 0.21289679 0.50057632 0.05754141 0.50889845]]

Normalized X: 
[[ 0.55370345 -0.62355858 -0.55067004  0.12263248  1.50124889]
 [-1.3631687   2.15101713 -0.74822824  0.29609936 -1.33354746]
 [ 0.17670235 -1.3200749  -0.98582894  1.0522525   0.56365429]
 [ 0.78262823  1.11337112 -0.1484589   0.4245384  -0.86138409]
 [ 1.46338749 -0.99944826  0.16347615 -1.62746153  0.1971178 ]]
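
As a sanity check on the result above, a normalized array should have a mean of about 0 and a standard deviation of about 1. The sketch below verifies this on a fresh random array. The seeded generator (`np.random.default_rng(0)`) and the names `Z` and `Z_cols` are assumptions added for reproducibility and illustration, not part of the original solution. It also shows the per-column variant (`axis=0`), which is what is usually wanted when each column is a separate feature.

```python
import numpy as np

# Assumption: a seeded generator so the example is reproducible.
rng = np.random.default_rng(0)
X = rng.random((5, 5))

# Whole-array normalization, as in the solution above:
# one mean and one standard deviation for all 25 elements.
Z = (X - X.mean()) / X.std()
print("Mean close to 0:", np.isclose(Z.mean(), 0.0))
print("Std close to 1: ", np.isclose(Z.std(), 1.0))

# Per-column variant: each column gets its own mean and standard deviation.
# Broadcasting subtracts the length-5 row of column means from every row of X.
Z_cols = (X - X.mean(axis=0)) / X.std(axis=0)
print("Column means close to 0:", np.allclose(Z_cols.mean(axis=0), 0.0))
print("Column stds close to 1: ", np.allclose(Z_cols.std(axis=0), 1.0))
```

Note that `.std()` uses the population standard deviation (`ddof=0`) by default. Passing `ddof=1` would give the sample standard deviation instead, and the normalized values would differ slightly.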


# DIVISIBLE BY 3 PROBLEM

##### Create a 10 x 10 ndarray containing the squares of the first 100 positive integers. From this ndarray, determine all the elements that are divisible by 3. Save the result as div_by_3.npy

In [33]:
# Import NumPy, which provides support for multidimensional arrays (ndarray)
import numpy as np

def ndarray_and_div_by_3():
    # Build the first 100 positive integers (np.arange), reshape them into a 10x10 ndarray (.reshape), and square each element (np.square)
    A = np.square(np.arange(1,101).reshape(10,10))
    print("A = ")
    print(A)

    # Find the elements of A that are divisible by 3 (boolean mask indexing returns a flat 1-D array)
    div_by_3 = A[A % 3 == 0]
    print("\nElements divisible by 3: ")
    print(div_by_3)
    
    # Save the elements that are divisible by 3 as div_by_3.npy
    np.save('div_by_3.npy',div_by_3)

ndarray_and_div_by_3()

A = 
[[    1     4     9    16    25    36    49    64    81   100]
 [  121   144   169   196   225   256   289   324   361   400]
 [  441   484   529   576   625   676   729   784   841   900]
 [  961  1024  1089  1156  1225  1296  1369  1444  1521  1600]
 [ 1681  1764  1849  1936  2025  2116  2209  2304  2401  2500]
 [ 2601  2704  2809  2916  3025  3136  3249  3364  3481  3600]
 [ 3721  3844  3969  4096  4225  4356  4489  4624  4761  4900]
 [ 5041  5184  5329  5476  5625  5776  5929  6084  6241  6400]
 [ 6561  6724  6889  7056  7225  7396  7569  7744  7921  8100]
 [ 8281  8464  8649  8836  9025  9216  9409  9604  9801 10000]]

Elements divisible by 3: 
[   9   36   81  144  225  324  441  576  729  900 1089 1296 1521 1764
 2025 2304 2601 2916 3249 3600 3969 4356 4761 5184 5625 6084 6561 7056
 7569 8100 8649 9216 9801]
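
The result above can be cross-checked with a little number theory: because 3 is prime, n² is divisible by 3 exactly when n is. The selected elements should therefore be the squares of the multiples of 3 up to 99, which gives 33 values. The following is a minimal sketch of that check. The names `n` and `expected` are introduced here for illustration only.

```python
import numpy as np

# The first 100 positive integers and their squares as a 10x10 ndarray.
n = np.arange(1, 101)
A = np.square(n).reshape(10, 10)

# Boolean mask indexing, as in the solution above.
div_by_3 = A[A % 3 == 0]

# Alternative route: square only the integers that are themselves multiples of 3.
expected = np.square(n[n % 3 == 0])

print("Same elements:", np.array_equal(div_by_3, expected))
print("Count:", div_by_3.size)
```

Both routes give the same 33 elements in the same order, because boolean indexing walks the array row by row and the squares are stored in increasing order.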
