<a href="https://colab.research.google.com/github/sagunkayastha/CAI_Workshop/blob/main/Workshop_s2/DL_intro2.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
!wget -q https://raw.githubusercontent.com/sagunkayastha/CAI_Workshop/main/Workshop_s2/utils/utils.py

In [None]:
import numpy as np
import pandas as pd
np.random.seed(42)


***A Single Neuron***

!['Single_neuron'](https://raw.githubusercontent.com/sagunkayastha/CAI_Workshop/main/Workshop_s2/images/i1.png)

$$z = \sum_{i=1}^{n} x_i w_i + b$$

$$output = \sigma(z)$$


In [None]:
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def single_neuron(x, weights, bias):
    # Try looping through each element in the input vector and multiplying it by the corresponding weight
    z = np.dot(x, weights) + bias
    output = sigmoid(z)
    return output



x = np.array([0.5, 0.3])  # Input vector
weights = np.array([0.4, -0.2])  # Weights for each input
bias = np.array([0.1])  # Bias

output = single_neuron(x, weights, bias)
print("x shape:", x.shape, "weights shape:", weights.shape, "bias shape:", bias.shape)
print("Output of the single neuron:", output)
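
The comment inside `single_neuron` suggests trying an explicit loop. Below is a minimal sketch of that idea, assuming the same input values as above; `weighted_sum_loop` is a hypothetical helper name introduced only for illustration. It should produce the same `z` as `np.dot`.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def weighted_sum_loop(x, weights, bias):
    """Compute z = sum_i x_i * w_i + b one element at a time (hypothetical helper)."""
    z = 0.0
    for x_i, w_i in zip(x, weights):
        z += x_i * w_i
    return z + bias

x = np.array([0.5, 0.3])
weights = np.array([0.4, -0.2])
bias = 0.1  # scalar here for simplicity

z_loop = weighted_sum_loop(x, weights, bias)
z_dot = np.dot(x, weights) + bias
print("z (loop):", z_loop, "z (np.dot):", z_dot)
print("output:", sigmoid(z_loop))
```

The loop makes the sum in the formula explicit; `np.dot` does the same arithmetic in a single vectorized call.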

$$MSE = (real - output)^2$$



In [None]:

def loss_function(predicted, real):
    return (predicted - real) ** 2

loss = loss_function(output, 0.7)
print("Loss:", loss)


The gradients for updating the weights and bias are calculated using the chain rule as follows:

- Gradient with respect to weights:
  $$dLoss/dWeights = dLoss/dOutput \cdot dOutput/dZ \cdot dZ/dWeights$$
  
- Gradient with respect to bias:
  $$dLoss/dBias = dLoss/dOutput \cdot dOutput/dZ \cdot dZ/dBias$$


---------------


$$\frac{dLoss}{dOutput} = -2 \times (real - output)$$



$$\frac{dOutput}{dZ} = \sigma(z) \cdot (1 - \sigma(z))  = output \cdot (1 - output)$$


$$\frac{dZ}{dW_i} = x_i$$

$$\frac{dZ}{dBias} = 1$$


---------------------


Using a learning rate $\eta$, update the weights and bias as follows:

$$w_i^{new} = w_i - \eta \cdot \frac{\partial L}{\partial w_i}$$
$$b^{new} = b - \eta \cdot \frac{\partial L}{\partial b}$$
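
One way to sanity-check the chain-rule gradients is to compare them with numerical derivatives. The sketch below assumes the same single sigmoid neuron and squared-error loss as above; `loss_at` and the step size `eps` are names and values chosen for illustration only.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def loss_at(x, w, b, real):
    """Squared error of a single sigmoid neuron (hypothetical helper)."""
    output = sigmoid(np.dot(x, w) + b)
    return (real - output) ** 2

x = np.array([0.5, 0.3])
w = np.array([0.4, -0.2])
b = 0.1
real = 0.7

# Analytic gradients from the chain rule (the factor of 2 is kept here)
output = sigmoid(np.dot(x, w) + b)
dL_dout = -2 * (real - output)
dout_dz = output * (1 - output)
grad_w = dL_dout * dout_dz * x   # dZ/dW_i = x_i
grad_b = dL_dout * dout_dz       # dZ/dBias = 1

# Numerical gradients by central differences
eps = 1e-6
num_grad_w = np.zeros_like(w)
for i in range(len(w)):
    step = np.zeros_like(w)
    step[i] = eps
    num_grad_w[i] = (loss_at(x, w + step, b, real) - loss_at(x, w - step, b, real)) / (2 * eps)
num_grad_b = (loss_at(x, w, b + eps, real) - loss_at(x, w, b - eps, real)) / (2 * eps)

print("analytic:", grad_w, grad_b)
print("numeric: ", num_grad_w, num_grad_b)
```

If the two rows agree to several decimal places, the derivation is consistent with the loss as written.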


In [None]:


def update_weights_and_bias(x, weights, bias, output, target, learning_rate):
    """Perform backpropagation and update the weights and bias."""
    # Compute the derivative of the loss with respect to output
    dLoss_dOutput = -(target - output)  # we ignore the factor of 2 for simplicity

    # Compute the derivative of the output with respect to z
    dOutput_dZ = output * (1 - output)

    # Compute the gradient of the loss with respect to weights
    dLoss_dWeights = dLoss_dOutput * dOutput_dZ * x

    # Compute the gradient of the loss with respect to bias
    dLoss_dBias = dLoss_dOutput * dOutput_dZ


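    # Update the weights and bias out of place, so the arrays passed in by the
    # caller are not modified and "previous" values can still be printed later
    weights = weights - learning_rate * dLoss_dWeights
    bias = bias - learning_rate * dLoss_dBias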

    return weights, bias

In [None]:
updated_weights, updated_bias = update_weights_and_bias(x, weights, bias, output, 0.7, 0.1)

updated_output = single_neuron(x, updated_weights, updated_bias)
updated_loss = loss_function(updated_output, 0.7)


print("Previous output:", output, "Previous loss:", loss)
print("Updated output:", updated_output, "Updated loss:", updated_loss)

In [None]:
print("Previous weights:", weights, "Previous bias:", bias)
print("Updated weights:", updated_weights, "Updated bias:", updated_bias)

!['Two2_neuron'](https://raw.githubusercontent.com/sagunkayastha/CAI_Workshop/main/Workshop_s2/images/i2.png)

In [None]:
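def two_layers(x, weights1, weights2, biases1, biases2):
    # Hidden layer: activations of the two hidden neurons
    a12_a22 = sigmoid(np.dot(x, weights1) + biases1)
    # Output layer: one neuron fed by the hidden activations
    output = sigmoid(np.dot(a12_a22, weights2) + biases2)

    return output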

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

x = np.array([0.5, 0.3])

# Weights for each neuron, each column represents a single neuron
weights1 = np.array([[0.4, -0.2],
                    [0.1, 0.6], ])
weights2 = np.array([0.1, -0.2])  # Weights for the output layer
# Biases for each neuron
biases1 = np.array([0.1, -0.2])
biases2 = np.array([0.2])

output = two_layers(x, weights1, weights2, biases1, biases2)

print("x shape:", x.shape, "weights1 shape:", weights1.shape)
print("Output of the two-layer network:", output)
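
To see why each column of `weights1` represents one hidden neuron, the sketch below computes the hidden layer one column at a time and compares it with the single matrix product. It reuses the values from the cell above; the names `hidden_loop`, `hidden_matrix`, and `final` are introduced here for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

x = np.array([0.5, 0.3])
weights1 = np.array([[0.4, -0.2],
                     [0.1, 0.6]])   # column j holds the weights of hidden neuron j
biases1 = np.array([0.1, -0.2])
weights2 = np.array([0.1, -0.2])
biases2 = np.array([0.2])

# Hidden layer, one neuron (column) at a time
hidden_loop = np.array([
    sigmoid(np.dot(x, weights1[:, j]) + biases1[j])
    for j in range(weights1.shape[1])
])

# The same computation as a single matrix product
hidden_matrix = sigmoid(np.dot(x, weights1) + biases1)

# Output neuron fed by the two hidden activations
final = sigmoid(np.dot(hidden_matrix, weights2) + biases2)

print("hidden:", hidden_matrix, "shape:", hidden_matrix.shape)
print("output:", final, "shape:", final.shape)
```

The shapes follow the usual rule: an input of shape `(2,)` times a `(2, 2)` weight matrix gives `(2,)` hidden activations, and those times a `(2,)` weight vector plus a `(1,)` bias give a `(1,)` output.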

----------------------

##### Trying a similar model with a generated dataset

In [None]:
from utils import generate_data, plot_data

In [None]:
x1, x2, y = generate_data(1000)
fig = plot_data(x1, x2, y)
fig.show()

***Normalization*** is a data preparation step that rescales every feature to a comparable range (for example, 0 to 1 with min-max scaling). This is important because:

- Faster learning: gradient-based training tends to converge in fewer steps when all features share a similar scale.
- Fair treatment: a feature with a large numeric range cannot overpower the others simply because of its units.
- Better predictions: training is usually more stable, which often leads to more accurate results.
- Works well with many models: some models, such as neural networks and distance-based methods, can perform poorly on unscaled data.
- Avoids problems: very large inputs can saturate activations such as the sigmoid and stall learning.
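
As a small illustration of the points above, here is min-max scaling written out by hand. This is a sketch with made-up numbers; `min_max_scale`, `ages`, and `incomes` are hypothetical names. The formula `(x - min) / (max - min)` is the one scikit-learn's `MinMaxScaler` applies with its default range of 0 to 1.

```python
import numpy as np

def min_max_scale(values):
    """Rescale a 1-D array linearly so its minimum maps to 0 and its maximum to 1."""
    values = np.asarray(values, dtype=float)
    lo, hi = values.min(), values.max()
    return (values - lo) / (hi - lo)  # assumes hi > lo

# Made-up features on very different scales
ages = np.array([20.0, 35.0, 50.0])
incomes = np.array([20_000.0, 65_000.0, 110_000.0])

scaled_ages = min_max_scale(ages)
scaled_incomes = min_max_scale(incomes)
print(scaled_ages)
print(scaled_incomes)
```

After scaling, both features span the same 0 to 1 range, so neither dominates the weighted sum just because of its units.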

In [None]:
# normalize the data
from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler()
X1 = scaler.fit_transform(x1.reshape(-1, 1)).flatten()
X2 = scaler.fit_transform(x2.reshape(-1, 1)).flatten()

X = np.array([X1, X2]).T


In [None]:
#  initialize the weights and bias
weights = np.random.rand(2)
bias = np.random.rand(1)

# take a single example
single_x = X[50]
single_y = y[50]

output = single_neuron( single_x, weights, bias)
loss = loss_function(output, single_y)
print("Output:", output, "Loss:", loss)
updated_weights, updated_bias = update_weights_and_bias(single_x, weights, bias, output, single_y, 0.1)

updated_output = single_neuron(single_x, updated_weights, updated_bias)
updated_loss = loss_function(updated_output, single_y)
print("Previous output:", output, "Previous loss:", loss)
print("Updated output:", updated_output, "Updated loss:", updated_loss)

In [None]:
epoch_loss = 0
for single_x, single_y in zip(X, y):
    output = single_neuron( single_x, weights, bias)
    loss = loss_function(output, single_y)
    weights, bias = update_weights_and_bias(single_x, weights, bias, output, single_y, 0.1)

    # print("Previous output:", output, "Previous loss:", loss)
    # print("Updated output:", updated_output, "Updated loss:", updated_loss)
    epoch_loss += loss

epoch_loss = epoch_loss / len(X)
print("First Epoch loss:", epoch_loss)

epoch_loss = 0
for single_x, single_y in zip(X, y):
    output = single_neuron( single_x, weights, bias)
    loss = loss_function(output, single_y)
    weights, bias = update_weights_and_bias(single_x, weights, bias, output, single_y, 0.1)

    # print("Previous output:", output, "Previous loss:", loss)
    # print("Updated output:", updated_output, "Updated loss:", updated_loss)
    epoch_loss += loss

epoch_loss = epoch_loss / len(X)
print("Second Epoch loss:", epoch_loss)


In [None]:

np.random.seed(402)
weights = np.random.rand(2)
bias = np.random.rand(1)

epoch_losses = []
for epoch in range(100): # This is the number of times we iterate through the entire dataset

    epoch_loss = 0

    for single_x, single_y in zip(X, y): ## This is iterating through the entire dataset
        output = single_neuron( single_x, weights, bias)
        loss = loss_function(output, single_y)
        weights, bias = update_weights_and_bias(single_x, weights, bias, output, single_y, 0.01)

        # print("Previous output:", output, "Previous loss:", loss)
        # print("Updated output:", updated_output, "Updated loss:", updated_loss)
        epoch_loss += loss

    epoch_loss = epoch_loss / len(X)
    epoch_losses.append(epoch_loss)
    print(f"Epoch {epoch} loss:", epoch_loss[0])
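
The per-example loop above can also be written as one vectorized update per epoch (full-batch gradient descent instead of one update per example). The sketch below is self-contained and uses synthetic stand-in data, which is an assumption: it is not the output of `generate_data`. The names `X_demo`, `y_demo`, `history`, and the learning rate of 0.5 are chosen for illustration only.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

rng = np.random.default_rng(0)

# Synthetic stand-in data (an assumption; not produced by generate_data)
X_demo = rng.random((200, 2))
y_demo = sigmoid(X_demo @ np.array([1.5, -2.0]) + 0.3)

w = rng.random(2)
b = 0.0
learning_rate = 0.5
history = []

for epoch in range(200):
    out = sigmoid(X_demo @ w + b)        # forward pass for all rows at once
    err = out - y_demo
    history.append(np.mean(err ** 2))    # mean squared error for this epoch
    delta = err * out * (1 - out)        # dLoss/dZ per example (factor of 2 dropped)
    w = w - learning_rate * (X_demo.T @ delta) / len(X_demo)
    b = b - learning_rate * delta.mean()

print("first epoch loss:", history[0])
print("last epoch loss:", history[-1])
```

Averaging the gradient over all rows gives one smoother update per epoch, whereas the loop above makes many noisier updates; both follow the same chain-rule gradient.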

##### TensorFlow model

In [None]:
import tensorflow as tf
from sklearn.model_selection import train_test_split
from tensorflow.keras.optimizers import SGD  # the experimental path is not available in every TensorFlow release


np.random.seed(402)
tf.random.set_seed(42)
weights = np.random.rand(2)
bias = np.random.rand(1)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=402)


# Define the model architecture
model = tf.keras.Sequential([
    tf.keras.layers.Dense(units=1, activation='sigmoid', input_shape=(2,),
                          # Reshape to the kernel's (inputs, units) shape so each value maps to one weight
                          kernel_initializer=tf.keras.initializers.Constant(weights.reshape(2, 1)),
                          bias_initializer=tf.keras.initializers.Constant(bias))
])
model.summary()



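#### Number of Parameters
- ResNet-50: about 25 million
- GPT-4: reportedly about 1.76 trillion (an unofficial estimate; the figure has not been published by OpenAI)
- Llama 2: 7B, 13B, and 70B variants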
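
For fully connected layers, the parameter count can be worked out by hand: each unit has one weight per input plus one bias. The sketch below assumes plain dense layers; `dense_param_count` and the 2 -> 6 -> 3 -> 1 layer sizes are illustrative choices.

```python
def dense_param_count(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

# A single neuron with 2 inputs: 2 weights + 1 bias
single = dense_param_count(2, 1)

# A small stack of dense layers: 2 -> 6 -> 3 -> 1
layer_sizes = [2, 6, 3, 1]
stacked = sum(
    dense_param_count(n_in, n_out)
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
)

print(single)   # 3
print(stacked)  # 18 + 21 + 4 = 43
```

These are the numbers that `model.summary()` reports in its "Param #" column for dense layers of the same sizes.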

In [None]:

model.compile(optimizer='SGD', loss='mse')

# Train the model
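# Hold out 20% of the training data for validation. Passing validation_data as well
# would override validation_split, and the test set should stay unseen until evaluation.
model.fit(X_train, y_train, epochs=10, batch_size=1, validation_split=0.2)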

In [None]:

# Define the model architecture
model = tf.keras.Sequential([
    tf.keras.layers.Dense(units=6, activation='relu', input_shape=(2,),
                          ),
    tf.keras.layers.Dense(units=3, activation='relu'),
    tf.keras.layers.Dense(units=1)

])
model.summary()  # summary() prints the table itself and returns None

model.compile(optimizer='adam', loss='mse')

# Train the model
model.fit(X_train, y_train, epochs=50, batch_size=32, validation_split=0.2)


##### Machine Learning Model vs Neural Network

In [None]:
from sklearn.metrics import r2_score, mean_squared_error
y_pred = model.predict(X_test)
mse = mean_squared_error(y_test, y_pred)
print("Mean Squared Error for Neural Network :", mse)

In [None]:

from sklearn.ensemble import RandomForestRegressor
rf = RandomForestRegressor(n_estimators=100, n_jobs=-1)
rf.fit(X_train, y_train.ravel())
y_pred = rf.predict(X_test)

mse = mean_squared_error(y_test, y_pred)
print("Mean Squared Error for Random Forest:", mse)

### Parameters vs Hyperparameters

- Definition: Parameters are learned from data; hyperparameters are set before training.
- Role: Parameters make predictions; hyperparameters guide the learning process.
- Adjustment: Parameters adjust automatically; hyperparameters are chosen manually (or can be searched using algorithms).
- Examples: Parameters are weights/biases; hyperparameters include learning rate, epochs.
- Optimization: Parameters optimized during training; hyperparameters through testing various settings.
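
The distinction can be made concrete with the single neuron from earlier. In the sketch below, the weights and bias are parameters (updated by gradient descent), while the learning rate is a hyperparameter (picked by trying several values). `train_single_neuron`, the candidate learning rates, and the 50 update steps are assumptions made for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def train_single_neuron(x, target, learning_rate, steps=50):
    """Return the final squared error after `steps` gradient updates."""
    w = np.array([0.4, -0.2])   # parameters: learned during training
    b = 0.1
    for _ in range(steps):
        out = sigmoid(np.dot(x, w) + b)
        delta = -(target - out) * out * (1 - out)   # dLoss/dZ (factor of 2 dropped)
        w = w - learning_rate * delta * x
        b = b - learning_rate * delta
    return float((sigmoid(np.dot(x, w) + b) - target) ** 2)

x = np.array([0.5, 0.3])

# Hyperparameter search: try several learning rates and keep the best one
results = {lr: train_single_neuron(x, 0.7, lr) for lr in (0.01, 0.1, 1.0)}
best_lr = min(results, key=results.get)

print(results)
print("best learning rate:", best_lr)
```

In practice the comparison would be made on validation data rather than on the training example itself; this toy version only shows who sets which value.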

#### Try creating a neural network for bmi_data in the data folder