### **Logistic Regression is used for classification, unlike Linear Regression, which is used to predict numeric values.**

In [None]:
from google.colab import drive

drive.mount('/content/gdrive/', force_remount=True)

Mounted at /content/gdrive/


In [None]:
%cd /content/gdrive/MyDrive/SLIIT/Data_Science/Data_Science_Projects/Coursera Projects/TensorFlow/IBM Course of TensorFlow

/content/gdrive/MyDrive/SLIIT/Data_Science/Data_Science_Projects/Coursera Projects/TensorFlow/IBM Course of TensorFlow


In [None]:
#import necessary libraries

import tensorflow as tf
print("TensorFlow version:", tf.__version__)
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_iris #Let's take a dataset from sklearn itself

TensorFlow version: 2.15.0


In [None]:
#get the dataset into iris variable
iris = load_iris()

#Taking all the records apart from the last row
iris_X, iris_y = iris.data[:-1,:], iris.target[:-1]

#applying one-hot encoding on the dependent variable
iris_y= pd.get_dummies(iris_y).values

#split the dataset into training & testing data
trainX, testX, trainY, testY = train_test_split(iris_X, iris_y, test_size=0.33, random_state=42)

The reason for using one-hot encoding in this scenario, even though the labels are represented by numbers, is to prevent the machine learning model from misinterpreting the numerical labels as having some sort of ordinal or quantitative relationship. Here's a more detailed explanation:

Avoiding Implicit Order or Magnitude: When you have categorical data represented as numbers (like 0, 1, 2 for different species), a machine learning model might incorrectly assume that these numbers imply an order (like 2 > 1 > 0) or a magnitude relationship. This is not the case here, as the numbers are purely nominal labels.

Treating Each Category Equally: One-hot encoding converts each categorical value into a binary vector. This ensures that each category (species in this case) is treated equally by the model. If we just use 0, 1, 2, the model might infer that the category represented by 2 is somehow 'twice' that of the category represented by 1, which doesn't make sense in this context.

Improving Model Performance: Many machine learning models, especially those based on linear assumptions, perform better when categorical data is one-hot encoded. For a target variable, as here, the one-hot vector has one entry per class, which matches the three outputs the model produces and lets the loss compare them element by element.

For example, in the Iris dataset:

Setosa might be encoded as [1, 0, 0]
Versicolour as [0, 1, 0]
Virginica as [0, 0, 1]

In this way, each species is represented by a distinct vector, and the model can learn separate weights for each species without assuming any numerical relationship between them.
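
As a minimal sketch of the encoding described above, the snippet below builds one-hot vectors with NumPy only. The `labels` array is a made-up sample for illustration, not data from the notebook, and `np.eye` indexing is used here as a stand-in for the `pd.get_dummies` call in the cell above.

```python
import numpy as np

# Hypothetical integer class labels in the same 0/1/2 scheme as iris.target.
labels = np.array([0, 1, 2, 1])

# Row i of the identity matrix is the one-hot vector for class i,
# so indexing the identity matrix by the labels encodes all of them at once.
num_classes = 3
one_hot = np.eye(num_classes, dtype=int)[labels]
print(one_hot)

# argmax along axis 1 recovers the original integer labels.
recovered = one_hot.argmax(axis=1)
print(recovered)
```

Every row contains exactly one 1, so no class is numerically "larger" than another.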

In [None]:
# numFeatures is the number of features in our input data.
# In the iris dataset, this number is '4'.
numFeatures = trainX.shape[1]
print('numFeatures is : ', numFeatures)
# numLabels is the number of classes our data points can be in.
# In the iris dataset, this number is '3'.
numLabels = trainY.shape[1]
print('numLabels is : ', numLabels )

#X = tf.Variable( np.identity(numFeatures), tf.TensorShape(numFeatures),dtype='float32') # Iris has 4 features, so X is a tensor to hold our data.
#yGold = tf.Variable(np.array([1,1,1]),shape=tf.TensorShape(numLabels),dtype='float32') # This will be our correct answers matrix for 3 classes.

numFeatures is :  4
numLabels is :  3


In [None]:
#Lets convert training and testing sets into TensorFlow objects
trainX = tf.constant(trainX, dtype='float32')
trainY = tf.constant(trainY, dtype='float32')
testX = tf.constant(testX, dtype='float32')
testY = tf.constant(testY, dtype='float32')

### **Let's Initialize weights**

In [None]:
W = tf.Variable(tf.zeros([4, 3]))  # 4-dimensional input and  3 classes
b = tf.Variable(tf.zeros([3])) # 3-dimensional output [0,0,1],[0,1,0],[1,0,0]

In machine learning, particularly in a classification task using a model like logistic regression or a basic neural network, the weights (W) and biases (b) play a crucial role. The shapes of W and b are determined based on the structure of the input data and the desired structure of the output data. Let's break down why W and b are shaped this way in your example:

  Weights Matrix W:
        Shape: The shape of W is [4, 3]. This is because you have a 4-dimensional input and 3 classes to predict. Each feature in the input data must be connected to each class in the output.
        Reasoning: In the Iris dataset, there are 4 features (sepal length, sepal width, petal length, petal width). For each class (Setosa, Versicolour, Virginica), the model learns a weight for each feature. Thus, you have 4 features × 3 classes = 12 weight parameters in total.

  Biases Vector b:
        Shape: The shape of b is [3]. This corresponds to the 3 classes in the output.
        Reasoning: The bias allows the model to shift the output function to better fit the data. Each class has its own bias, allowing the model to adjust the output for each class independently.

  How They Work Together:
        When making a prediction, the input data (a 4-dimensional vector for each instance) is multiplied by the weights matrix W and then the bias b is added. Mathematically, this can be represented as Y = XW + b, where X is your input data.
        This operation results in a 3-dimensional vector for each instance, where each dimension corresponds to the model's prediction for each of the three classes. The values are then typically passed through a softmax function to convert them into probabilities; the model defined later in this notebook applies an element-wise sigmoid instead.

  Example:
        Suppose you have a single instance of Iris data: [5.1, 3.5, 1.4, 0.2] (4-dimensional).
        This instance is multiplied by W (a 4×3 matrix), resulting in a temporary 3-dimensional vector.
        Then, b (a 3-dimensional vector) is added to this temporary vector, resulting in the final 3-dimensional output.

In summary, the shape [4, 3] for W and [3] for b directly corresponds to the dimensions of the input data (4 features) and the output predictions (3 possible classes). This setup is crucial for the model to learn the appropriate transformations from inputs to outputs.
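
The shape walk-through above can be sketched with plain NumPy. This is an illustration under stated assumptions: `W` and `b` are all zeros (matching the initialization cell above), and the `softmax` helper is written here only for the example; it is not part of the notebook's model.

```python
import numpy as np

# One Iris instance as a batch of size 1: shape (1, 4).
x = np.array([[5.1, 3.5, 1.4, 0.2]])

# Zero-initialized parameters: 4 features x 3 classes, and one bias per class.
W = np.zeros((4, 3))
b = np.zeros(3)

# Y = XW + b: (1, 4) @ (4, 3) gives (1, 3); b is broadcast across the batch.
logits = x @ W + b
print(logits.shape)


def softmax(z):
    """Row-wise softmax; subtracting the row max keeps exp() numerically stable."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


# With all-zero parameters every class gets the same probability.
probs = softmax(logits)
print(probs)
```

With untrained zero parameters the three class probabilities are equal, which is why training has to move `W` and `b` away from zero before the model can separate the species.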

In [None]:
#Randomly sample from a normal distribution with standard deviation .01
#This is to initialize the W & b for a neural network
weights = tf.Variable(tf.random.normal([numFeatures,numLabels],
                                       mean=0.,
                                       stddev=0.01,
                                       name="weights"),dtype='float32')

bias = tf.Variable(tf.random.normal([1,numLabels],
                                    mean=0.,
                                    stddev=0.01,
                                    name="bias"))

### **Now Let's define the hypothesis model**

In [None]:
# Three-component breakdown of the Logistic Regression equation.
# Note that these feed into each other.
def logistic_regression(x):
    #x is the input & it is first multiplied by weights
    apply_weights_OP = tf.matmul(x, weights, name="apply_weights")

    #then the bias is added to the result
    add_bias_OP = tf.add(apply_weights_OP, bias, name="add_bias")

    #then the result will be sent into the sigmoid function
    activation_OP = tf.nn.sigmoid(add_bias_OP, name="activation")

    return activation_OP

### **Next, lets define the cost function & other functions**

In [None]:
# Number of Epochs in our training
numEpochs = 700

# Defining our learning rate schedule (exponential decay)
learningRate = tf.keras.optimizers.schedules.ExponentialDecay(initial_learning_rate=0.0008,
                                          decay_steps=trainX.shape[0],
                                          decay_rate= 0.95,
                                          staircase=True)

#Defining our cost function - Mean Squared Logarithmic Error
loss_object = tf.keras.losses.MeanSquaredLogarithmicError()
optimizer = tf.keras.optimizers.SGD(learningRate)

# Accuracy metric.
def accuracy(y_pred, y_true):
    # Predicted class is the index of the highest score in prediction vector (i.e. argmax).
    correct_prediction = tf.equal(tf.argmax(y_pred, 1), tf.argmax(y_true, 1))
    return tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

#tf.argmax(y_pred, 1) finds the index of the highest value in each prediction vector. In classification tasks, each element of y_pred is a vector where each value represents the model's confidence that the input corresponds to a particular class.
#tf.argmax effectively chooses the class with the highest confidence as the model's prediction.
#The 1 in tf.argmax(y_pred, 1) refers to the axis along which to find the argmax. If your prediction is a 2D tensor (batch_size x number_of_classes), axis 1 refers to the number_of_classes axis.

#    tf.cast(correct_prediction, tf.float32) converts the boolean tensor to a float tensor (True becomes 1.0 and False becomes 0.0).
#    tf.reduce_mean(tf.cast(correct_prediction, tf.float32)) calculates the mean of these values.
#    Since the correct predictions are 1.0 and the incorrect ones are 0.0, the mean of this tensor gives the proportion of correct predictions in the batch, which is the accuracy.

# Optimization process.
def run_optimization(x, y):
    with tf.GradientTape() as g:
        pred = logistic_regression(x)
        # Keras loss objects are called as loss(y_true, y_pred)
        loss = loss_object(y, pred)
    gradients = g.gradient(loss, [weights, bias])
    optimizer.apply_gradients(zip(gradients, [weights, bias]))
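
The accuracy metric defined in the cell above can be checked by hand on a tiny batch. The sketch below uses NumPy in place of TensorFlow, and the `y_pred` and `y_true` values are invented for illustration only.

```python
import numpy as np

# Hypothetical model outputs for 4 samples and 3 classes.
y_pred = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
    [0.2, 0.5, 0.3],
    [0.6, 0.3, 0.1],
])

# Hypothetical one-hot ground truth for the same 4 samples.
y_true = np.array([
    [1, 0, 0],
    [0, 0, 1],
    [1, 0, 0],
    [1, 0, 0],
])

# argmax over axis 1 picks the class index per row; the comparison
# yields one boolean per sample.
correct = y_pred.argmax(axis=1) == y_true.argmax(axis=1)

# Casting booleans to floats and averaging gives the fraction correct.
acc = correct.astype(np.float32).mean()
print(correct, acc)
```

Three of the four rows agree, so the mean of the cast values is 0.75.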

In [None]:
# Initialize reporting variables
display_step = 10
epoch_values = []
accuracy_values = []
loss_values = []
loss = 0
diff = 1

# Training epochs
for i in range(numEpochs):
    #The optimizer minimizes the loss. Once the loss stops changing meaningfully, further epochs add little, so we need a rule for when to stop.
    #When the change in loss between two reports becomes very small (diff < 0.0001), we treat the model as converged and stop the optimization.
    if i > 1 and diff < .0001:
        print("change in loss %g; convergence."%diff)
        break
    else:
        # Run training step
        run_optimization(trainX, trainY)

        # Report occasional stats
        if i % display_step == 0:
            # Add epoch to epoch_values
            epoch_values.append(i)

            pred = logistic_regression(testX)

            newLoss = loss_object(testY, pred)  # Keras losses take (y_true, y_pred)
            # Add loss to live graphing variable
            loss_values.append(newLoss)

            # Generate accuracy stats on test data
            acc = accuracy(pred, testY)
            accuracy_values.append(acc)


            # Re-assign values for variables
            diff = abs(newLoss - loss) #abs is a python function that gives absolute value of a number
            loss = newLoss

            #generate print statements
            print("step %d, test accuracy %g, loss %g, change in loss %g"%(i, acc, newLoss, diff))

# How well do we perform on held-out test data?
print("final accuracy on test set: %s" %acc.numpy())





  Initialization:
        display_step, epoch_values, accuracy_values, loss_values are initialized for tracking and reporting purposes.
        loss and diff are initialized for tracking the change in loss to check for convergence.

  Training Loop:
        The loop runs for a number of iterations specified by numEpochs.
        At the start of each iteration after the first two (i > 1), it checks whether the change in loss (diff) is below 0.0001. If so, the model is treated as converged and the loop stops early.

  Training Step:
        run_optimization(trainX, trainY) is called to perform a training step. This function updates the model's weights based on the training data.

  Reporting and Evaluation:
        Every display_step epochs, it performs evaluation and reporting:
            epoch_values.append(i) records the current epoch.
            pred = logistic_regression(testX) makes predictions on the test dataset.
            newLoss = loss_object(pred, testY) computes the loss on the test dataset.
            loss_values.append(newLoss) records the computed loss.
            acc = accuracy(pred, testY) calculates the accuracy on the test dataset.
            accuracy_values.append(acc) records the accuracy.

  Loss Difference Calculation:
        diff = abs(newLoss - loss) calculates the absolute difference between the current test loss and the loss at the previous report (display_step epochs earlier).
        The current loss is then stored in loss for comparison at the next report.

  Logging:
        The script prints out the current epoch number, the accuracy measured on the test set, the loss, and the change in loss.

  Final Evaluation:
        After completing the training epochs, the final accuracy on the test set is printed.
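
The stopping rule summarized above can be isolated in a few lines of plain Python. This is a sketch under stated assumptions: `steps_until_converged` is a hypothetical helper, not a function from the notebook, and the loss values are invented to show the rule firing.

```python
def steps_until_converged(reported_losses, tol=1e-4):
    """Return the index of the first report whose loss changed by less than
    `tol` compared with the previous report, or None if that never happens."""
    previous = 0.0  # mirrors `loss = 0` before the training loop
    for idx, current in enumerate(reported_losses):
        diff = abs(current - previous)
        # Skip the first report: its "previous" value is only the placeholder 0.
        if idx > 0 and diff < tol:
            return idx
        previous = current
    return None


# Hypothetical losses, one per report.
losses = [0.20, 0.15, 0.12, 0.11995, 0.11]
stop_at = steps_until_converged(losses)
print(stop_at)
```

Note that a rule like this reacts to the first small change it sees, even if the loss would have kept falling afterwards (as in the last value of the sample list), so the threshold and the reporting interval both affect when training ends.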

In [None]:
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
plt.xlabel("Epoch")
plt.ylabel("Loss")
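plt.plot(epoch_values, loss_values)  # loss is recorded every display_step epochs, so plot it against the recorded epoch numbers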
plt.show()