In this workshop, you'll predict the salary offered to an applicant, given data on other
applicants and the salaries they were offered.

- `column 1:` [Microsoft Python Certification Exam][1] score
- `column 2:` years experience
- `column 3:` ML rating
- `column 4:` monthly salary offered, in units of 100,000 pesos

> 👀 Large values like the salary need to be scaled down, otherwise your MSE will be very high.
> Remember that before taking the average, we compute the SUM of the squared errors first,
> and a sum of squares of LARGE numbers grows very quickly and can head towards $\infty$.

[1]: https://www.udemy.com/course/microsoft-python-certification-exam-98-381-practice-tests/
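
To see why the scale of the target matters for MSE, here is a minimal sketch that is not part of the original notebook. The two salaries are taken from the data used in this workshop, while the predictions are made-up numbers chosen only for illustration. The same errors measured in pesos instead of units of 100,000 pesos give an MSE that is $10^{10}$ times larger.

```python
import numpy as np

# Two salaries in units of 100,000 pesos, with hypothetical predictions.
y_true = np.array([1.122, 0.960])
y_pred = np.array([1.000, 1.000])  # made-up values for illustration


def mse(actual, predicted):
    # Sum of squared errors first, then divide by the count to get the mean.
    return np.sum((actual - predicted) ** 2) / len(actual)


mse_scaled = mse(y_true, y_pred)
mse_pesos = mse(y_true * 100_000, y_pred * 100_000)

print(mse_scaled)
print(mse_pesos)
print(mse_pesos / mse_scaled)  # the ratio is about 1e10
```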

In [238]:
import numpy as np
from sklearn.preprocessing import MinMaxScaler

import tensorflow as tf
from tensorflow import keras

# display 3 places, suppress scientific notation
np.set_printoptions(precision=3, suppress=True)

scores    = [40, 35, 28, 31, 33, 35, 35, 39, 29, 38, 40, 32, 40, 29, 28]
years_exp = [ 7,  3, 10,  8,  5, 12,  9, 10,  6,  7,  4,  4, 10,  5,  1]

X = np.hstack((np.array([scores]).T, np.array([years_exp]).T))

In [251]:
y = np.array([scores]).T * 2.8 + \
    np.array([years_exp]).T * 0.7 

y /= 100

np.hstack((X, y))

array([[40.   ,  7.   ,  5.   ,  1.169],
       [35.   ,  3.   ,  4.   ,  1.001],
       [28.   , 10.   ,  2.   ,  0.854],
       [31.   ,  8.   ,  2.   ,  0.924],
       [33.   ,  5.   ,  0.   ,  0.959],
       [35.   , 12.   ,  5.   ,  1.064],
       [35.   ,  9.   ,  4.   ,  1.043],
       [39.   , 10.   ,  1.   ,  1.162],
       [29.   ,  6.   ,  2.   ,  0.854],
       [38.   ,  7.   ,  0.   ,  1.113],
       [40.   ,  4.   ,  1.   ,  1.148],
       [32.   ,  4.   ,  4.   ,  0.924],
       [40.   , 10.   ,  3.   ,  1.19 ],
       [29.   ,  5.   ,  3.   ,  0.847],
       [28.   ,  1.   ,  0.   ,  0.791]])

In [240]:
# score, exp, ML, salary
data = np.array([
    [40.   ,  7.   ,  5.   ,  1.169],
    [35.   ,  3.   ,  4.   ,  1.001],
    [28.   , 10.   ,  2.   ,  0.854],
    [31.   ,  8.   ,  2.   ,  0.924],
    [33.   ,  5.   ,  0.   ,  0.959],
    [35.   , 12.   ,  5.   ,  1.064],
    [35.   ,  9.   ,  4.   ,  1.043],
    [39.   , 10.   ,  1.   ,  1.162],
    [29.   ,  6.   ,  2.   ,  0.854],
    [38.   ,  7.   ,  0.   ,  1.113],
    [40.   ,  4.   ,  1.   ,  1.148],
    [32.   ,  4.   ,  4.   ,  0.924],
    [40.   , 10.   ,  3.   ,  1.19 ],
    [29.   ,  5.   ,  3.   ,  0.847],
    [28.   ,  1.   ,  0.   ,  0.791]])

data

array([[40.   ,  7.   ,  5.   ,  1.122],
       [35.   ,  3.   ,  4.   ,  0.96 ],
       [28.   , 10.   ,  2.   ,  0.804],
       [31.   ,  8.   ,  2.   ,  0.87 ],
       [33.   ,  5.   ,  0.   ,  0.888],
       [35.   , 12.   ,  5.   ,  1.022],
       [35.   ,  9.   ,  4.   ,  0.996],
       [39.   , 10.   ,  1.   ,  1.082],
       [29.   ,  6.   ,  2.   ,  0.806],
       [38.   ,  7.   ,  0.   ,  1.03 ],
       [40.   ,  4.   ,  1.   ,  1.072],
       [32.   ,  4.   ,  4.   ,  0.888],
       [40.   , 10.   ,  3.   ,  1.124],
       [29.   ,  5.   ,  3.   ,  0.808],
       [28.   ,  1.   ,  0.   ,  0.734]])

Verifying we sliced $X$ correctly.

In [241]:
X = data[:, 0:3]
X

array([[40.,  7.,  5.],
       [35.,  3.,  4.],
       [28., 10.,  2.],
       [31.,  8.,  2.],
       [33.,  5.,  0.],
       [35., 12.,  5.],
       [35.,  9.,  4.],
       [39., 10.,  1.],
       [29.,  6.,  2.],
       [38.,  7.,  0.],
       [40.,  4.,  1.],
       [32.,  4.,  4.],
       [40., 10.,  3.],
       [29.,  5.,  3.],
       [28.,  1.,  0.]])

Verifying we sliced $y$ correctly.

In [242]:
y = data[:, [-1]]
y

array([[1.122],
       [0.96 ],
       [0.804],
       [0.87 ],
       [0.888],
       [1.022],
       [0.996],
       [1.082],
       [0.806],
       [1.03 ],
       [1.072],
       [0.888],
       [1.124],
       [0.808],
       [0.734]])

Displaying the lowest and highest of each of the following:
- certification score
- years experience
- ML rating

In [243]:
# first the lowest of each
np.min(X, axis=0) # axis=0 reduces over the rows, giving one minimum per column

array([28.,  1.,  0.])

In [244]:
# then the highest of each
np.max(X, axis=0)

array([40., 12.,  5.])
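
As a quick illustration of what the `axis` argument does (a sketch on a tiny made-up array, not the workshop data): reducing along axis 0 collapses the rows and leaves one value per column, while reducing along axis 1 collapses the columns and leaves one value per row.

```python
import numpy as np

# Hypothetical 2x3 array: 2 applicants (rows), 3 features (columns).
demo = np.array([
    [40.0, 7.0, 5.0],
    [28.0, 10.0, 2.0],
])

col_min = np.min(demo, axis=0)  # one minimum per column
row_min = np.min(demo, axis=1)  # one minimum per row

print(col_min)
print(row_min)
```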

Then we scale using Scikit-learn's `MinMaxScaler`, which is the easier route. We could also do this with NumPy's vectorized operations, which would mean one less library to learn.

Note that:
- certification scores of `40` become `1`, and scores of `28` become `0`.
- `1` year of experience becomes `0`, and `12` years become `1`.
- for ML rating, `0` stays `0` and `5` becomes `1`; the middle values `2` and `3` become `0.4` and `0.6`.
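
The NumPy alternative mentioned above can be sketched as follows. This is a minimal version, assuming the usual min-max formula $(x - x_{min}) / (x_{max} - x_{min})$ applied per column, on a subset of the rows. The names `X_demo` and `X_demo_norm` are ours, not the notebook's.

```python
import numpy as np

# A few rows of X (score, years of experience, ML rating) that include
# each column's minimum and maximum.
X_demo = np.array([
    [40.0,  7.0, 5.0],
    [28.0, 10.0, 2.0],
    [35.0, 12.0, 5.0],
    [29.0,  5.0, 3.0],
    [28.0,  1.0, 0.0],
])

col_min = X_demo.min(axis=0)
col_max = X_demo.max(axis=0)

# Broadcasting applies the per-column minimum and range to every row.
X_demo_norm = (X_demo - col_min) / (col_max - col_min)
print(X_demo_norm)
```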

In [245]:
scaler = MinMaxScaler()
X_norm = scaler.fit_transform(X)

# Showing the original values first
X

array([[40.,  7.,  5.],
       [35.,  3.,  4.],
       [28., 10.,  2.],
       [31.,  8.,  2.],
       [33.,  5.,  0.],
       [35., 12.,  5.],
       [35.,  9.,  4.],
       [39., 10.,  1.],
       [29.,  6.,  2.],
       [38.,  7.,  0.],
       [40.,  4.,  1.],
       [32.,  4.,  4.],
       [40., 10.,  3.],
       [29.,  5.,  3.],
       [28.,  1.,  0.]])

In [246]:
# and the normalized ones
X_norm

array([[1.   , 0.545, 1.   ],
       [0.583, 0.182, 0.8  ],
       [0.   , 0.818, 0.4  ],
       [0.25 , 0.636, 0.4  ],
       [0.417, 0.364, 0.   ],
       [0.583, 1.   , 1.   ],
       [0.583, 0.727, 0.8  ],
       [0.917, 0.818, 0.2  ],
       [0.083, 0.455, 0.4  ],
       [0.833, 0.545, 0.   ],
       [1.   , 0.273, 0.2  ],
       [0.333, 0.273, 0.8  ],
       [1.   , 0.818, 0.6  ],
       [0.083, 0.364, 0.6  ],
       [0.   , 0.   , 0.   ]])

In [247]:
model = keras.models.Sequential([
  keras.layers.Dense(units=1, input_shape=(3,))
])
model.compile(
  loss=keras.losses.MeanSquaredError(),
  optimizer=keras.optimizers.Adam(learning_rate=0.001)
)
model.fit(X_norm, y, epochs=500)

Epoch 1/500
Epoch 2/500
...

<keras.callbacks.History at 0x7f54944b6050>
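
Because a single `Dense` unit with no activation computes a weighted sum plus a bias, this model is ordinary linear regression. As a cross-check that is not part of the original notebook, the least-squares solution can be computed directly with `np.linalg.lstsq`. The sketch below assumes the `data` values displayed in the outputs above and does the min-max scaling by hand.

```python
import numpy as np

# score, exp, ML, salary (values as displayed in the notebook output)
data = np.array([
    [40.0,  7.0, 5.0, 1.122],
    [35.0,  3.0, 4.0, 0.960],
    [28.0, 10.0, 2.0, 0.804],
    [31.0,  8.0, 2.0, 0.870],
    [33.0,  5.0, 0.0, 0.888],
    [35.0, 12.0, 5.0, 1.022],
    [35.0,  9.0, 4.0, 0.996],
    [39.0, 10.0, 1.0, 1.082],
    [29.0,  6.0, 2.0, 0.806],
    [38.0,  7.0, 0.0, 1.030],
    [40.0,  4.0, 1.0, 1.072],
    [32.0,  4.0, 4.0, 0.888],
    [40.0, 10.0, 3.0, 1.124],
    [29.0,  5.0, 3.0, 0.808],
    [28.0,  1.0, 0.0, 0.734],
])

X = data[:, 0:3]
y = data[:, [-1]]

# Min-max scale each column to the range [0, 1].
X_norm = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))

# Append a column of ones so the last coefficient acts as the bias.
A = np.hstack((X_norm, np.ones((len(X_norm), 1))))
coef, *_ = np.linalg.lstsq(A, y, rcond=None)

pred = A @ coef
mse_fit = np.mean((y - pred) ** 2)
mse_mean = np.mean((y - y.mean()) ** 2)  # baseline: always predict the mean

print(coef.ravel())  # three weights followed by the bias
print(mse_fit, mse_mean)
```

With enough training, the Keras weights would be expected to move toward these coefficients, so a large gap between the two suggests the network has not converged yet.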

In [248]:
# Showing our training data again, so we can compare
data

array([[40.   ,  7.   ,  5.   ,  1.122],
       [35.   ,  3.   ,  4.   ,  0.96 ],
       [28.   , 10.   ,  2.   ,  0.804],
       [31.   ,  8.   ,  2.   ,  0.87 ],
       [33.   ,  5.   ,  0.   ,  0.888],
       [35.   , 12.   ,  5.   ,  1.022],
       [35.   ,  9.   ,  4.   ,  0.996],
       [39.   , 10.   ,  1.   ,  1.082],
       [29.   ,  6.   ,  2.   ,  0.806],
       [38.   ,  7.   ,  0.   ,  1.03 ],
       [40.   ,  4.   ,  1.   ,  1.072],
       [32.   ,  4.   ,  4.   ,  0.888],
       [40.   , 10.   ,  3.   ,  1.124],
       [29.   ,  5.   ,  3.   ,  0.808],
       [28.   ,  1.   ,  0.   ,  0.734]])

In [249]:
scaled_input = scaler.transform(np.array([
    [35, 5, 2],
    [28, 1, 0],
]))

# predictions are in units of 100,000 pesos, so convert them back to pesos
model.predict(scaled_input) * 100_000

array([[ 1889.983],
       [42659.387]])

In [250]:
model.weights

[<tf.Variable 'dense_28/kernel:0' shape=(3, 1) dtype=float32, numpy=
 array([[-0.192],
        [-0.72 ],
        [-0.085]], dtype=float32)>,
 <tf.Variable 'dense_28/bias:0' shape=(1,) dtype=float32, numpy=array([0.427], dtype=float32)>]
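
To connect the weights to the predictions, the sketch below recomputes the two predictions by hand as $Xw + b$, using the rounded weights printed above and the column minimums and maximums found earlier. Because the printed weights are rounded to three places, the results only approximately match the output of `model.predict`.

```python
import numpy as np

# Rounded kernel and bias copied from the model.weights output.
w = np.array([[-0.192], [-0.72], [-0.085]])
b = 0.427

# Column minimums and maximums of X, as found with np.min and np.max.
col_min = np.array([28.0, 1.0, 0.0])
col_max = np.array([40.0, 12.0, 5.0])

applicants = np.array([
    [35.0, 5.0, 2.0],
    [28.0, 1.0, 0.0],
])

# Same min-max scaling that the fitted scaler applies.
scaled = (applicants - col_min) / (col_max - col_min)

# Dense layer with one unit and no activation: weighted sum plus bias,
# then converted from units of 100,000 pesos to pesos.
manual = (scaled @ w + b) * 100_000
print(manual)
```

The second applicant scales to all zeros, so that prediction is just the bias times 100,000.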