In [1]:
import numpy as np
import pandas as pd

***A Single Neuron***

!['Single_neuron'](images\i1.png)

$$z = \sum_{i=1}^{n} x_i w_i + b$$

$$output = \sigma(z)$$ 
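
As a minimal sketch of the two formulas above, the weighted sum can be written as an explicit loop over the inputs and compared with `np.dot`. The function name `single_neuron_loop` and the sample values are illustrative only.

```python
import numpy as np

def sigmoid(z):
    # Squash any real number into the open interval (0, 1)
    return 1 / (1 + np.exp(-z))

def single_neuron_loop(x, weights, bias):
    # z = sum_i x_i * w_i + b, accumulated one term at a time
    z = bias
    for x_i, w_i in zip(x, weights):
        z += x_i * w_i
    return sigmoid(z)

x = np.array([0.5, 0.3])
weights = np.array([0.4, -0.2])
bias = 0.1

loop_output = single_neuron_loop(x, weights, bias)
dot_output = sigmoid(np.dot(x, weights) + bias)
print(loop_output, dot_output)
```

Both values should agree, since `np.dot` on two 1-D arrays computes the same sum of products.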


In [2]:
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def single_neuron(x, weights, bias):
    # Try looping through each element in the input vector and multiplying it by the corresponding weight
    z = np.dot(x, weights) + bias
    output = sigmoid(z)
    return output  



x = np.array([0.5, 0.3])  # Input vector
weights = np.array([0.4, -0.2])  # Weights for each input
bias = np.array([0.1])  # Bias

output = single_neuron(x, weights, bias)
print("x shape:", x.shape, "weights shape:", weights.shape, "bias shape:", bias.shape)
print("Output of the single neuron:", output)

x shape: (2,) weights shape: (2,) bias shape: (1,)
Output of the single neuron: [0.55971365]


$$MSE = (real - output)^2$$



In [3]:

def loss_function(predicted, real):
    return (predicted - real) ** 2

loss = loss_function(output, 0.7)
print("Loss:", loss)


Loss: [0.01968026]


The gradients for updating the weights and bias are calculated using the chain rule as follows:

- Gradient with respect to weights: 
  $$dLoss/dWeights = dLoss/dOutput \cdot dOutput/dZ \cdot dZ/dWeights$$
  
- Gradient with respect to bias: 
  $$dLoss/dBias = dLoss/dOutput \cdot dOutput/dZ \cdot dZ/dBias$$


---------------


$$\frac{dLoss}{dOutput} = -2 \times (real - output)$$



$$\frac{dOutput}{dZ} = \sigma(z) \cdot (1 - \sigma(z))  = output \cdot (1 - output)$$


$$\frac{dZ}{dW_i} = x_i$$


$$\frac{dZ}{dBias} = 1$$
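
One way to sanity-check these derivatives is a finite-difference gradient check. The sketch below is an illustration, not part of the original notebook: it multiplies the three chain-rule factors (with dZ/dBias = 1 for the bias) and compares the result with a numerical estimate. The helper `loss_for` and the step size `eps` are assumptions made for this example.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def loss_for(weights, bias, x, real):
    # Squared error of a single sigmoid neuron on one example
    output = sigmoid(np.dot(x, weights) + bias)
    return (real - output) ** 2

x = np.array([0.5, 0.3])
weights = np.array([0.4, -0.2])
bias = 0.1
real = 0.7

# Analytic gradients from the chain rule (factor of 2 kept)
output = sigmoid(np.dot(x, weights) + bias)
common = -2 * (real - output) * output * (1 - output)
analytic_dw = common * x      # dZ/dW_i = x_i
analytic_db = common * 1.0    # dZ/dBias = 1

# Numerical gradients via central differences
eps = 1e-6
numeric_dw = np.zeros_like(weights)
for i in range(len(weights)):
    step = np.zeros_like(weights)
    step[i] = eps
    numeric_dw[i] = (loss_for(weights + step, bias, x, real)
                     - loss_for(weights - step, bias, x, real)) / (2 * eps)
numeric_db = (loss_for(weights, bias + eps, x, real)
              - loss_for(weights, bias - eps, x, real)) / (2 * eps)

print("weights match:", np.allclose(analytic_dw, numeric_dw, atol=1e-7))
print("bias matches:", np.isclose(analytic_db, numeric_db, atol=1e-7))
```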


---------------------


Using a learning rate $\eta$, update the weights and bias as follows:

$$w_i^{new} = w_i - \eta \cdot \frac{\partial L}{\partial w_i}$$
$$b^{new} = b - \eta \cdot \frac{\partial L}{\partial b}$$


In [4]:

    
def update_weights_and_bias(x, weights, bias, output, target, learning_rate):
    """Perform backpropagation and update the weights and bias."""
    # Compute the derivative of the loss with respect to output
    dLoss_dOutput = -(target - output)  # we ignore the factor of 2 for simplicity
    
    # Compute the derivative of the output with respect to z
    dOutput_dZ = output * (1 - output)
    
    # Compute the gradient of the loss with respect to weights
    dLoss_dWeights = dLoss_dOutput * dOutput_dZ * x
    
    # Compute the gradient of the loss with respect to bias
    dLoss_dBias = dLoss_dOutput * dOutput_dZ
    
    
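    # Update the weights and bias.
    # Note: `-=` modifies the caller's arrays in place, so the arrays passed in
    # and the arrays returned are the same objects.
    weights -= learning_rate * dLoss_dWeights
    bias -= learning_rate * dLoss_dBias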
    
    return weights, bias

In [5]:
updated_weights, updated_bias = update_weights_and_bias(x, weights, bias, output, 0.7, 0.1)

updated_output = single_neuron(x, updated_weights, updated_bias)
updated_loss = loss_function(updated_output, 0.7)


print("Previous output:", output, "Previous loss:", loss)
print("Updated output:", updated_output, "Updated loss:", updated_loss)

Previous output: [0.55971365] Previous loss: [0.01968026]
Updated output: [0.56085495] Updated loss: [0.01936134]


In [6]:
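# `weights` and `bias` were modified in place by update_weights_and_bias,
# so the "previous" values printed here are already the updated ones.
print("Previous weights:", weights, "Previous bias:", bias)
print("Updated weights:", updated_weights, "Updated bias:", updated_bias)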

Previous weights: [ 0.40172857 -0.19896286] Previous bias: [0.10345714]
Updated weights: [ 0.40172857 -0.19896286] Updated bias: [0.10345714]
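
The two printed lines are identical because the update uses in-place `-=`, so `weights` and `updated_weights` refer to the same array. A minimal sketch of one way to keep the old values for comparison is to copy them before the update. The helper `update_in_place` below is a compact stand-in written for this example, not the notebook's function.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def update_in_place(x, weights, bias, target, learning_rate):
    # Same update rule as above (factor of 2 dropped), applied with in-place `-=`
    output = sigmoid(np.dot(x, weights) + bias)
    delta = -(target - output) * output * (1 - output)
    weights -= learning_rate * delta * x
    bias -= learning_rate * delta
    return weights, bias

x = np.array([0.5, 0.3])
weights = np.array([0.4, -0.2])
bias = np.array([0.1])

# Snapshot the values before the update so they survive the in-place change
previous_weights = weights.copy()
previous_bias = bias.copy()

updated_weights, updated_bias = update_in_place(x, weights, bias, 0.7, 0.1)

print("Previous weights:", previous_weights, "Previous bias:", previous_bias)
print("Updated weights:", updated_weights, "Updated bias:", updated_bias)
print("Same array object:", updated_weights is weights)
```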


!['Two2_neuron'](images\i2.png)

In [7]:
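def two_layers(x, weights1, weights2, biases1, biases2):
    # Hidden layer: activations of the two hidden neurons
    a12_a22 = sigmoid(np.dot(x, weights1) + biases1)
    # Output layer: combine the hidden activations into a single output
    output = sigmoid(np.dot(a12_a22, weights2) + biases2)

    return output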

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

x = np.array([0.5, 0.3])  

# Weights for each neuron, each column represents a single neuron
weights1 = np.array([[0.4, -0.2], 
                    [0.1, 0.6], ])
weights2 = np.array([0.1, -0.2])  # Weights for the output layer
# Biases for each neuron
biases1 = np.array([0.1, -0.2])
biases2 = np.array([0.2])

output = two_layers(x, weights1, weights2, biases1, biases2)

print("x shape, weight1 shape", x.shape, weights1.shape)

x shape, weight1 shape (2,) (2, 2)
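
To make the "each column represents a single neuron" convention concrete, the sketch below computes the hidden layer twice: once with a single matrix product and once neuron by neuron, taking one column of `weights1` at a time. The variable `hidden_by_neuron` is introduced only for this illustration.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

x = np.array([0.5, 0.3])
weights1 = np.array([[0.4, -0.2],
                     [0.1, 0.6]])   # column j holds the weights of hidden neuron j
biases1 = np.array([0.1, -0.2])
weights2 = np.array([0.1, -0.2])
biases2 = np.array([0.2])

# Matrix form: one dot product computes both hidden neurons at once
hidden = sigmoid(np.dot(x, weights1) + biases1)

# Per-neuron form: take one column of weights1 at a time
hidden_by_neuron = np.array([
    sigmoid(np.dot(x, weights1[:, j]) + biases1[j])
    for j in range(weights1.shape[1])
])

output = sigmoid(np.dot(hidden, weights2) + biases2)
print("hidden shape:", hidden.shape, "output shape:", output.shape)
print("forms agree:", np.allclose(hidden, hidden_by_neuron))
```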


----------------------

##### Trying a similar model with a generated dataset

In [8]:
from utils.dataset import generate_data, plot_data

In [23]:
x1, x2, y = generate_data(1000)
fig = plot_data(x1, x2, y)
fig.show()

***Normalization*** is a data preparation step that rescales every feature to a similar range (here, `MinMaxScaler` maps each feature to [0, 1]). This is important because:

- Faster learning: gradient-based training usually converges in fewer steps when the features share a similar scale.
- Fair treatment: no feature dominates the weighted sum just because its raw values are larger.
- More stable predictions: training is less sensitive to the units in which each feature was measured.
- Required by many models: distance-based and gradient-based models often need normalized inputs to work well.
- Fewer numerical problems: very large inputs can saturate activations such as the sigmoid and stall learning.
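
As a rough illustration of what min-max normalization does, the sketch below applies the formula (value - min) / (max - min) with plain NumPy. The function `min_max_scale` and the two feature arrays are hypothetical and exist only for this example.

```python
import numpy as np

def min_max_scale(values):
    # Rescale to [0, 1]: (value - min) / (max - min)
    values = np.asarray(values, dtype=float)
    value_range = values.max() - values.min()
    if value_range == 0:
        # A constant feature has no spread to rescale; map it to zeros
        return np.zeros_like(values)
    return (values - values.min()) / value_range

# Two hypothetical features on very different scales
heights_cm = np.array([150.0, 160.0, 175.0, 190.0])
incomes = np.array([20_000.0, 35_000.0, 80_000.0, 120_000.0])

scaled_heights = min_max_scale(heights_cm)
scaled_incomes = min_max_scale(incomes)
print(scaled_heights)
print(scaled_incomes)
```

After scaling, both features span the same [0, 1] range, so neither dominates the weighted sum because of its units.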

In [24]:
# Normalize each feature to the [0, 1] range (fit_transform refits the scaler on every call)
from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler()
X1 = scaler.fit_transform(x1.reshape(-1, 1)).flatten()
X2 = scaler.fit_transform(x2.reshape(-1, 1)).flatten()




In [27]:
X = np.array([X1, X2]).T

In [28]:
#  initialize the weights and bias
weights = np.random.rand(2)
bias = np.random.rand(1)

# take a single example
single_x = X[50]
single_y = y[50]

output = single_neuron( single_x, weights, bias)
loss = loss_function(output, single_y)
print("Output:", output, "Loss:", loss)
updated_weights, updated_bias = update_weights_and_bias(single_x, weights, bias, output, single_y, 0.1)

updated_output = single_neuron(single_x, updated_weights, updated_bias)
updated_loss = loss_function(updated_output, single_y)
print("Previous output:", output, "Previous loss:", loss)
print("Updated output:", updated_output, "Updated loss:", updated_loss)

Output: [0.57197649] Loss: [575.87042165]
Previous output: [0.57197649] Previous loss: [575.87042165]
Updated output: [0.72215353] Updated loss: [568.68528785]


In [29]:
epoch_loss = 0
for single_x, single_y in zip(X, y):
    output = single_neuron( single_x, weights, bias)
    loss = loss_function(output, single_y)
    weights, bias = update_weights_and_bias(single_x, weights, bias, output, single_y, 0.1)
    
    # print("Previous output:", output, "Previous loss:", loss)
    # print("Updated output:", updated_output, "Updated loss:", updated_loss)
    epoch_loss += loss
    
epoch_loss = epoch_loss / len(X)
print("First Epoch loss:", epoch_loss)

epoch_loss = 0
for single_x, single_y in zip(X, y):
    output = single_neuron( single_x, weights, bias)
    loss = loss_function(output, single_y)
    weights, bias = update_weights_and_bias(single_x, weights, bias, output, single_y, 0.1)
    
    # print("Previous output:", output, "Previous loss:", loss)
    # print("Updated output:", updated_output, "Updated loss:", updated_loss)
    epoch_loss += loss
    
epoch_loss = epoch_loss / len(X)
print("Second Epoch loss:", epoch_loss)


First Epoch loss: [9894.13668067]
Second Epoch loss: [9894.03738029]


In [31]:

np.random.seed(402)
weights = np.random.rand(2)
bias = np.random.rand(1)

epoch_losses = []
for epoch in range(100): # This is the number of times we iterate through the entire dataset
    
    epoch_loss = 0
    
    for single_x, single_y in zip(X, y): ## This is iterating through the entire dataset
        output = single_neuron( single_x, weights, bias)
        loss = loss_function(output, single_y)
        weights, bias = update_weights_and_bias(single_x, weights, bias, output, single_y, 0.01)
        
        # print("Previous output:", output, "Previous loss:", loss)
        # print("Updated output:", updated_output, "Updated loss:", updated_loss)
        epoch_loss += loss
        
    epoch_loss = epoch_loss / len(X)
    epoch_losses.append(epoch_loss)
    print(f"Epoch loss: {epoch}", epoch_loss[0])

Epoch loss: 0 9894.917817161488
Epoch loss: 1 9894.122350897023
Epoch loss: 2 9894.08355052273
Epoch loss: 3 9894.067507193451
Epoch loss: 4 9894.058657694932
Epoch loss: 5 9894.053031618389
Epoch loss: 6 9894.049132990609
Epoch loss: 7 9894.046269440552
Epoch loss: 8 9894.04407571628
Epoch loss: 9 9894.042340648171
Epoch loss: 10 9894.040933549057
Epoch loss: 11 9894.039769161758
Epoch loss: 12 9894.03878947408
Epoch loss: 13 9894.03795362876
Epoch loss: 14 9894.037232012004
Epoch loss: 15 9894.03660263057
Epoch loss: 16 9894.036048806145
Epoch loss: 17 9894.035557659661
Epoch loss: 18 9894.03511908653
Epoch loss: 19 9894.03472504707
Epoch loss: 20 9894.034369064004
Epoch loss: 21 9894.034045860259
Epoch loss: 22 9894.033751092851
Epoch loss: 23 9894.033481154927
Epoch loss: 24 9894.033233025786
Epoch loss: 25 9894.033004156481
Epoch loss: 26 9894.032792381058
Epoch loss: 27 9894.032595847124
Epoch loss: 28 9894.032412961089
Epoch loss: 29 9894.032242344312
Epoch loss: 30 9894.0320827
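
One possible reason the loss plateaus near 9894: the inputs were normalized but the targets `y` were not, and a sigmoid output always lies in (0, 1). For any target above 1, the squared error can never fall below (target - 1)^2, whatever the weights are. The sketch below illustrates this floor with hypothetical targets (the real values come from `generate_data`, which is not shown here) and shows the targets after min-max scaling, where they fall inside the range a sigmoid can approach.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Hypothetical targets, far outside the sigmoid's (0, 1) output range
y = np.array([80.0, 95.0, 100.0, 120.0])

# Best case for a sigmoid neuron on these targets: the output saturates at
# (numerically) 1, which is still far below every target
best_possible_output = sigmoid(50.0)
loss_floor = np.mean((y - best_possible_output) ** 2)
print("Lowest reachable mean loss on raw targets:", loss_floor)

# After min-max scaling, the targets lie in [0, 1]
y_scaled = (y - y.min()) / (y.max() - y.min())
print("Scaled targets:", y_scaled)
```

Under this assumption, either scaling `y` or using an output layer without a sigmoid would let the loss decrease further.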

##### TensorFlow model

In [32]:
import tensorflow as tf
from sklearn.model_selection import train_test_split


np.random.seed(402)
weights = np.random.rand(2)
bias = np.random.rand(1)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=402)


# Define the model architecture
model = tf.keras.Sequential([
    tf.keras.layers.Dense(units=1, activation='sigmoid', input_shape=(2,),
                          kernel_initializer=tf.keras.initializers.Constant(weights),
                          bias_initializer=tf.keras.initializers.Constant(bias))
])
model.summary()



Model: "sequential_3"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_7 (Dense)             (None, 1)                 3         
                                                                 
Total params: 3 (12.00 Byte)
Trainable params: 3 (12.00 Byte)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________


#### Number of Parameters
- ResNet-50: about 25M parameters
- GPT-4: reportedly about 1.76 trillion parameters (an unconfirmed estimate)
- Llama 2: 7B, 13B, and 70B parameter variants
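
For fully connected layers, the parameter count can be worked out by hand: each layer has (inputs x units) weights plus one bias per unit. The sketch below uses a hypothetical helper, `dense_param_count`, to reproduce the 3 parameters of the single-neuron model (2 inputs, 1 unit) and to count a 2-6-3-1 network.

```python
def dense_param_count(layer_sizes):
    # layer_sizes = [inputs, units_layer_1, units_layer_2, ...]
    # Each dense layer has (inputs * units) weights plus one bias per unit
    total = 0
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        total += n_in * n_out + n_out
    return total

single_neuron_params = dense_param_count([2, 1])        # 2*1 + 1 = 3
small_network_params = dense_param_count([2, 6, 3, 1])  # 18 + 21 + 4 = 43
print(single_neuron_params, small_network_params)
```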

In [34]:

model.compile(optimizer='SGD', loss='mse')

# Train the model
# validation_data takes precedence over validation_split, so only validation_data is passed
model.fit(X_train, y_train, epochs=10, batch_size=1, validation_data=(X_test, y_test))

Epoch 1/10
174/800 [=====>........................] - ETA: 0s - loss: 11106.3359

...
Epoch 10/10


<keras.src.callbacks.History at 0x1f43292e8e0>

In [35]:
import tensorflow as tf
from sklearn.model_selection import train_test_split
np.random.seed(402)

# Define the model architecture
model = tf.keras.Sequential([
    tf.keras.layers.Dense(units=6, activation='relu', input_shape=(2,),
                          ),
    tf.keras.layers.Dense(units=3, activation='relu'),
    tf.keras.layers.Dense(units=1)
    
])
print(model.summary())

model.compile(optimizer='adam', loss='mse')

# Train the model
model.fit(X_train, y_train, epochs=50, batch_size=32, validation_split=0.2)


Model: "sequential_4"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_8 (Dense)             (None, 6)                 18        
                                                                 
 dense_9 (Dense)             (None, 3)                 21        
                                                                 
 dense_10 (Dense)            (None, 1)                 4         
                                                                 
Total params: 43 (172.00 Byte)
Trainable params: 43 (172.00 Byte)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
None
Epoch 1/50
...
Epoch 50/50


<keras.src.callbacks.History at 0x1f4329d3520>

##### Machine Learning Model vs Neural Network

In [45]:
from sklearn.metrics import r2_score, mean_squared_error
y_pred = model.predict(X_test)
mse = mean_squared_error(y_test, y_pred)
print("Mean Squared Error for Neural Network :", mse)

Mean Squared Error for Neural Network : 2081.6233585866016


In [46]:

from sklearn.ensemble import RandomForestRegressor
rf = RandomForestRegressor(n_estimators=100, n_jobs=-1)
rf.fit(X_train, y_train.ravel())
y_pred = rf.predict(X_test)

mse = mean_squared_error(y_test, y_pred)
print("Mean Squared Error for Random Forest:", mse)

Mean Squared Error for Random Forest: 6.993473272646764
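
`r2_score` is imported above but not used. As a hedged sketch of what the two metrics measure, the code below computes MSE and R² by hand with NumPy on hypothetical values. It flattens both arrays first, because a Keras `predict` call on a one-unit output layer returns shape `(n, 1)`, and subtracting a `(n,)` array from a `(n, 1)` array would broadcast to `(n, n)`.

```python
import numpy as np

def mse(y_true, y_pred):
    # Flatten both arrays so (n, 1) and (n,) inputs line up element by element
    y_true, y_pred = np.ravel(y_true), np.ravel(y_pred)
    return np.mean((y_true - y_pred) ** 2)

def r2(y_true, y_pred):
    # R^2 = 1 - (residual sum of squares) / (total sum of squares)
    y_true, y_pred = np.ravel(y_true), np.ravel(y_pred)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1 - ss_res / ss_tot

# Hypothetical targets with shape (n, 1) and predictions with shape (n,)
y_true = np.array([[100.0], [120.0], [90.0], [110.0]])
y_pred = np.array([98.0, 123.0, 91.0, 108.0])

example_mse = mse(y_true, y_pred)
example_r2 = r2(y_true, y_pred)
print("MSE:", example_mse, "R^2:", example_r2)
```

Unlike MSE, R² does not depend on the scale of `y`, which can make the two models easier to compare.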


### Parameters vs Hyperparameters

- Definition: Parameters are learned from the data; hyperparameters are set before training starts.
- Role: Parameters are what the model uses to make predictions; hyperparameters control how the learning process runs.
- Adjustment: Parameters are adjusted automatically by the optimizer; hyperparameters are chosen manually or found with search algorithms such as grid search.
- Examples: Parameters are the weights and biases; hyperparameters include the learning rate, number of epochs, and batch size.
- Optimization: Parameters are optimized during training; hyperparameters are tuned by comparing runs with different settings.
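
The distinction can be sketched with a small grid search over the learning rate for a single sigmoid neuron. Everything in this block is an assumption for illustration: the toy data, the function `train_single_neuron`, and the candidate learning rates are not from the notebook.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def train_single_neuron(X, y, learning_rate, epochs, seed=0):
    # Parameters (weights, bias) are learned from the data inside this loop
    rng = np.random.default_rng(seed)
    weights = rng.random(X.shape[1])
    bias = rng.random()
    for _ in range(epochs):
        for x_i, y_i in zip(X, y):
            output = sigmoid(np.dot(x_i, weights) + bias)
            delta = -(y_i - output) * output * (1 - output)
            weights = weights - learning_rate * delta * x_i
            bias = bias - learning_rate * delta
    predictions = sigmoid(X @ weights + bias)
    return np.mean((predictions - y) ** 2)

# Hypothetical toy data whose targets already lie in (0, 1)
rng = np.random.default_rng(402)
X = rng.random((200, 2))
y = sigmoid(2.0 * X[:, 0] - 1.0 * X[:, 1] + 0.5)

# Hyperparameters are fixed before each run and compared by final loss
candidate_learning_rates = [0.001, 0.01, 0.1, 1.0]
final_losses = {
    lr: train_single_neuron(X, y, lr, epochs=20)
    for lr in candidate_learning_rates
}
best_learning_rate = min(final_losses, key=final_losses.get)
print(final_losses)
print("Best learning rate:", best_learning_rate)
```

For brevity this sketch compares losses on the training data; a more careful search would compare them on a held-out validation split.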

#### Try creating a NN for bmi_data in the data folder