# NORMALIZATION

Normalization is one of the most basic preprocessing techniques in data analytics. It involves two steps: centering and scaling. Centering means subtracting the mean from the data, and scaling means dividing the centered data by its standard deviation. Mathematically, normalization can be expressed as:

$$Z = \frac{X - \bar{x}}{\sigma}$$

In Python, the mean and standard deviation over all elements of an ndarray can be obtained by using the .mean() and .std() calls.

In this problem, create a random 5 x 5 ndarray and store it in the variable X. Normalize X. Save your normalized ndarray as X_normalized.npy

In [351]:
import numpy as np
#Creates a random 5x5 ndarray of integers and stores it in the variable x. The bounds restrict the random numbers to the range 0 to 9 (the upper bound 10 is exclusive).
x = np.random.randint(0,10,(5,5))
#The normalized array is stored in the variable z.
#np.mean gets the mean of all elements of x and np.std gets their standard deviation
z = (x-(np.mean(x)))/np.std(x)
#The normalized ndarray is then saved to 'normalized.npy'
np.save('normalized.npy', z)
z

array([[-0.71603901,  1.23679465,  0.26037782,  1.23679465, -1.36698357],
       [ 1.23679465, -0.39056673, -1.04151129,  0.26037782,  0.5858501 ],
       [ 0.91132238,  0.91132238,  1.23679465, -1.04151129, -1.69245584],
       [ 0.26037782,  0.5858501 ,  0.91132238, -1.36698357,  0.5858501 ],
       [-1.36698357, -0.39056673, -1.69245584,  0.91132238, -0.06509446]])
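As a sanity check on the formula, a normalized array should have a mean of about 0 and a standard deviation of about 1. The sketch below is not part of the original solution: it assumes a seeded generator (`np.random.default_rng(0)`, with an arbitrary seed) for reproducibility, uses the names X and X_normalized.npy from the problem statement, and writes to a temporary directory so that no existing file is overwritten.

```python
import os
import tempfile

import numpy as np

# Seeded generator so the example is reproducible (the seed value is arbitrary).
rng = np.random.default_rng(0)
X = rng.integers(0, 10, size=(5, 5))  # integers from 0 to 9 inclusive

# Center with the overall mean, then scale by the overall standard deviation.
Z = (X - X.mean()) / X.std()

# After normalization the mean should be ~0 and the standard deviation ~1.
mean_ok = np.isclose(Z.mean(), 0.0)
std_ok = np.isclose(Z.std(), 1.0)
print(mean_ok, std_ok)

# Save under the file name requested in the problem statement, inside a
# temporary directory, then load it back to confirm the round trip.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "X_normalized.npy")
    np.save(path, Z)
    round_trip_ok = np.array_equal(np.load(path), Z)
print(round_trip_ok)
```

Note that .std() uses the population standard deviation (ddof=0) by default, which is why the normalized standard deviation comes out as 1 rather than slightly below it.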

## Verification
##### Expected output: The ndarray is successfully saved

In [354]:
#Loads the 'normalized.npy' file for verification purposes
o = np.load('normalized.npy')
# this checks if the ndarray stored at variable z is the same as the ndarray loaded and stored at variable o
if (z==o).all():
    print("The ndarray is successfully saved")
else:
    print("Saving ndarray unsuccessful")

The ndarray is successfully saved


#### Displays the generated random array for manual verification purposes

In [357]:
#Displays the random array contents
x

array([[3, 9, 6, 9, 1],
       [9, 4, 2, 6, 7],
       [8, 8, 9, 2, 0],
       [6, 7, 8, 1, 7],
       [1, 4, 0, 8, 5]])
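The manual verification can be sketched for a single entry. The block below retypes the array displayed above (the name `x_shown` is introduced here only for illustration), recomputes the mean and standard deviation, and compares the top-left normalized value with the -0.71603901 shown in the normalized output.

```python
import numpy as np

# The random array exactly as displayed above.
x_shown = np.array([
    [3, 9, 6, 9, 1],
    [9, 4, 2, 6, 7],
    [8, 8, 9, 2, 0],
    [6, 7, 8, 1, 7],
    [1, 4, 0, 8, 5],
])

mean = x_shown.mean()  # 130 / 25 = 5.2
std = x_shown.std()    # sqrt(9.44), the population standard deviation (ddof=0)

# Top-left entry: (3 - 5.2) / sqrt(9.44)
z00 = (x_shown[0, 0] - mean) / std
print(mean, std, z00)

# Compare with the value in the normalized output shown earlier.
matches = np.isclose(z00, -0.71603901)
print(matches)
```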

# DIVISIBLE BY 3 PROBLEM

Create the following 10 x 10 ndarray:

$$A = \begin{bmatrix} 1 & 4 & \cdots & 81 & 100 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 8281 & 8464 & \cdots & 9801 & 10000 \end{bmatrix}$$

which are the squares of the first 100 positive integers.
From this ndarray, determine all the elements that are divisible by 3. Save the result as div_by_3.np

In [360]:

#Creates a 10x10 array that shows all the squares of the first 100 positive integers
Squared = (np.arange(1, 101)**2).reshape(10, 10)
#Filters the array to only show the numbers divisible by 3 and stores it to div3
div3 = Squared[Squared % 3 == 0]
#saves the div3 array to "div by 3.npy"
np.save('div by 3.npy', div3)
#To show the generated 10x10 ndarray 
Squared

array([[    1,     4,     9,    16,    25,    36,    49,    64,    81,
          100],
       [  121,   144,   169,   196,   225,   256,   289,   324,   361,
          400],
       [  441,   484,   529,   576,   625,   676,   729,   784,   841,
          900],
       [  961,  1024,  1089,  1156,  1225,  1296,  1369,  1444,  1521,
         1600],
       [ 1681,  1764,  1849,  1936,  2025,  2116,  2209,  2304,  2401,
         2500],
       [ 2601,  2704,  2809,  2916,  3025,  3136,  3249,  3364,  3481,
         3600],
       [ 3721,  3844,  3969,  4096,  4225,  4356,  4489,  4624,  4761,
         4900],
       [ 5041,  5184,  5329,  5476,  5625,  5776,  5929,  6084,  6241,
         6400],
       [ 6561,  6724,  6889,  7056,  7225,  7396,  7569,  7744,  7921,
         8100],
       [ 8281,  8464,  8649,  8836,  9025,  9216,  9409,  9604,  9801,
        10000]])

In [362]:
#loads the div by 3.npy to verify the answers
np.load('div by 3.npy')

array([   9,   36,   81,  144,  225,  324,  441,  576,  729,  900, 1089,
       1296, 1521, 1764, 2025, 2304, 2601, 2916, 3249, 3600, 3969, 4356,
       4761, 5184, 5625, 6084, 6561, 7056, 7569, 8100, 8649, 9216, 9801])
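The loaded result is one-dimensional even though the source array is 10 x 10, because boolean indexing returns the selected elements as a flat array. The sketch below illustrates this; the names `squares`, `mask`, `selected`, and `positions` are chosen here for illustration, and `np.argwhere` is shown as one possible way to recover where each selected element sat in the grid.

```python
import numpy as np

squares = (np.arange(1, 101) ** 2).reshape(10, 10)

mask = squares % 3 == 0    # boolean array with the same 10 x 10 shape
selected = squares[mask]   # boolean indexing returns a flattened 1-D copy

print(mask.shape, selected.shape)

# Row and column indices of the selected elements, one (row, col) pair per row.
positions = np.argwhere(mask)
print(positions[:3])

# n**2 is divisible by 3 exactly when n is, so 33 of the 100 squares qualify.
count = int(mask.sum())
print(count)
```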

## Verification

##### Another method: square every multiple of 3 from 3 to 99 and compare the result with the main array

In [366]:
#Defines a function that takes power, array_name, and array_data as its parameters.
#array_name is used for dynamic file naming
def tester(power, array_name, array_data):
    #This ensures that the list is emptied each time the function is called
    array_data.clear()
    #Loops over the multiples of 3 from 3 to 99
    for p in range(3,101,3):
        #If the power is 2, v is the square of a multiple of 3, which is a square divisible by 3
        #power is a parameter so that it can be changed for verification purposes
        v = p**power
        #Appends the number to the array_data list
        array_data.append(v)
    #Saves the list to a file named after array_name, then loads it back as an ndarray
    np.save('{}.npy'.format(array_name), array_data)
    return np.load('{}.npy'.format(array_name))
#Initializes empty lists for test_array and wrong_array
test_array = []
wrong_array = []
#Calls the tester function to fill test_array and wrong_array
#The call with a power of 2 gives the correct answer.
tester(1, "wrong_array", wrong_array)
tester(2, "test_array", test_array)

array([   9,   36,   81,  144,  225,  324,  441,  576,  729,  900, 1089,
       1296, 1521, 1764, 2025, 2304, 2601, 2916, 3249, 3600, 3969, 4356,
       4761, 5184, 5625, 6084, 6561, 7056, 7569, 8100, 8649, 9216, 9801])
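The same cross-check can also be sketched without a loop or a file round trip. This is an alternative to the tester function above, not a replacement for it; the main result is rebuilt inside the block so that it runs on its own, and the name `expected` is introduced here for illustration.

```python
import numpy as np

# Main result, rebuilt here so the example is self-contained.
squares = (np.arange(1, 101) ** 2).reshape(10, 10)
div3 = squares[squares % 3 == 0]

# Multiples of 3 from 3 to 99, squared in a single vectorized step.
expected = np.arange(3, 101, 3) ** 2

same = np.array_equal(div3, expected)
print(same)
```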

#### Compares the test array with the main array to verify that they are identical
##### Expected output: The main array is identical with the test array

In [369]:
def compare_and_verify_testarray():
    assert np.array_equal(test_array, div3)
    print("The main array is identical with the test array")
compare_and_verify_testarray()

The main array is identical with the test array


#### Compares the wrong array with the main array to verify that the check itself works
##### This raises an error since the main array is not equal to wrong_array
##### Expected output: AssertionError

In [372]:
def compare_and_verify_wrongarray():
    assert np.array_equal(wrong_array, div3)
    print("The main array is identical with the wrong array")
compare_and_verify_wrongarray()

AssertionError: 
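A bare assert leaves the AssertionError message empty and stops the notebook at that cell. As one possible alternative, the sketch below uses `np.testing.assert_array_equal`, whose error message describes the mismatch, and catches the expected failure so that execution can continue. The names `wrong` and `failed_as_expected` are introduced here for illustration, and the arrays are rebuilt so that the block runs on its own.

```python
import numpy as np

squares = (np.arange(1, 101) ** 2).reshape(10, 10)
div3 = squares[squares % 3 == 0]

# Deliberately wrong reference: the multiples of 3 themselves, not their squares.
wrong = np.arange(3, 101, 3)

try:
    np.testing.assert_array_equal(div3, wrong)
    failed_as_expected = False
except AssertionError as err:
    failed_as_expected = True
    # The message reports the mismatch; print only its first few lines.
    print("\n".join(str(err).splitlines()[:4]))

print(failed_as_expected)
```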