<center>
<table>
  <tr>
    <td><img src="https://portal.nccs.nasa.gov/datashare/astg/training/python/logos/nasa-logo.svg" width="100"/> </td>
     <td><img src="https://portal.nccs.nasa.gov/datashare/astg/training/python/logos/ASTG_logo.png?raw=true" width="80"/> </td>
     <td> <img src="https://www.nccs.nasa.gov/sites/default/files/NCCS_Logo_0.png" width="130"/> </td>
    </tr>
</table>
</center>

        
<center>
<h1><font color= "blue" size="+3">ASTG Python Courses</font></h1>
</center>

---

<center>
    <h1><font color="red">Logistic Regression Classifier Model with TensorFlow</font></h1>
</center>

# <font color="red">Objectives</font>

In this presentation, we use a simple classification dataset to:

- Build a TensorFlow model
- Train the model
- Evaluate the model

We show the steps for building a Machine Learning (ML) model with TensorFlow. The functions presented here can be used as a reference for other ML applications.

## <font color="red"> Python packages used</font>

- __Matplotlib__: Create visualization.
- __Pandas__: Data (two-dimensional labelled array) manipulation and analysis.
- __Seaborn__: Provide a high-level interface for creating attractive and informative statistical graphics. 
- __Scikit-Learn__:  Provide supervised and unsupervised Machine Learning algorithms.
- __TensorFlow__: Build, train, and evaluate deep learning models based on Neural Networks.

In [None]:
try:
    import google.colab
    print("Running in Google Colab")
except ImportError:
    print("Not running in Google Colab")
else:
    print("Installing modules in Google Colab")
    !pip install seaborn
    !pip install -U scikit-learn

In [None]:
import warnings
warnings.filterwarnings('ignore')

In [None]:
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
import pandas as pd
import seaborn as sns

In [None]:
from sklearn.model_selection import train_test_split
from sklearn import metrics

In [None]:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.layers import Dense, Flatten, Conv2D
from tensorflow.keras import Model
from tensorflow.keras.utils import plot_model

In [None]:
print(f"Numpy version:      {np.__version__}")
print(f"Pandas version:     {pd.__version__}")
print(f"Seaborn version:    {sns.__version__}")
print(f"TensorFlow version: {tf.__version__}")

# <font color="red">Loading the dataset</font>

## <font color="blue">Description of the data</font>

- We have a dataset whose features are points on a plane and whose label takes two values (classes).
   - Each point is assigned a class (`0` or `1`).
- We want to build a Machine Learning model that predicts the class of a given point.
- We will use __logistic regression__, a statistical method for predicting binary classes.
   - It is a generalized linear model: a linear combination of the features is mapped to a probability, and the target variable is categorical.
   - It is one of the simplest and most commonly used Machine Learning algorithms for two-class classification.
   - The outcome or the target variable has only two possible classes.
   - It predicts the probability of occurrence of a binary event using the logistic (sigmoid) function, the inverse of the logit function.
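
The relationship between the linear score, the logit, and the predicted probability can be sketched with NumPy. The weights `w`, the bias `b`, and the sample points are made-up values for illustration only, and the helper names `sigmoid` and `logit` are not part of the notebook's model.

```python
import numpy as np

def sigmoid(z):
    """Map a real-valued score (log-odds) to a probability in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def logit(p):
    """Inverse of the sigmoid: map a probability to its log-odds."""
    return np.log(p / (1.0 - p))

# Hypothetical weights and bias, chosen only for illustration.
w = np.array([1.5, -0.5])
b = 0.25

# Three sample points (x1, x2).
X = np.array([[0.0, 0.0], [2.0, 1.0], [-3.0, 2.0]])

scores = X @ w + b                    # linear part: the log-odds of class 1
probs = sigmoid(scores)               # probability of class 1
labels = (probs >= 0.5).astype(int)   # threshold the probability at 0.5

print(probs)
print(labels)
```

A point is assigned to class `1` exactly when its linear score is non-negative, which is why the boundary between the two classes is a straight line.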

## <font color="blue">Read the data</font>

In [None]:
url = "https://raw.githubusercontent.com/astg606/astg_pymaterials/refs/heads/main/ai_ml_tools/datasets/classifier_dataset.txt"

In [None]:
df = pd.read_csv(url, sep=r"\s+")
df

##  <font color="blue"> Splitting the data into training and testing sets</font>
- We split the data into training and testing sets. 
- We train the model with 70% of the samples and test with the remaining 30%. 

__Extract the train and test datasets as NumPy arrays__

In [None]:
X_train, X_test, y_train, y_test = train_test_split(df[["x1", "x2"]].values, 
                                                    df["label"].values, 
                                                    test_size=0.3, 
                                                    random_state=42)

In [None]:
X_train

In [None]:
X_train.shape

In [None]:
y_train

In [None]:
y_train.shape

In [None]:
np.bincount(y_train)

## <font color="blue">Visualize the data</font>

__Scatterplot of $x_1$ against $x_2$__

In [None]:
plt.scatter(X_train[:,0], X_train[:,1])
plt.xlabel(r"Feature $x_1$", fontsize=10)
plt.ylabel(r"Feature $x_2$", fontsize=10);

__Scatterplot with the two classes: `y=0` and `y=1`__

In [None]:
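def plot_classes(X: np.ndarray, y: np.ndarray, boundary: tuple = None) -> None:
    """Plot the points of each class and, optionally, a boundary line.

    boundary is a tuple (x1_min, x1_max, x2_min, x2_max).
    """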
    plt.plot(X[y==0, 0], X[y==0, 1],
        marker="D", markersize=10,
        linestyle="", label="Class 0",
    )

    plt.plot(X[y==1, 0], X[y==1, 1],
        marker="^", markersize=13,
        linestyle="", label="Class 1",
    )

    if boundary:
        plt.plot([boundary[0], boundary[1]], [boundary[2], boundary[3]], color="red")
    plt.legend(loc='best')
    plt.xlim([-5.5, 5.5])
    plt.ylim([-5.5, 5.5])

    plt.xlabel(r"Feature $x_1$", fontsize=12)
    plt.ylabel(r"Feature $x_2$", fontsize=12)

    plt.grid()

In [None]:
plot_classes(X_train, y_train)

## <font color="blue">Normalize the Data</font> <a class="anchor" id="sec_tf_norm"></a>

- In general, variables may not be on a similar scale. Features with large values would dominate any distance-based calculation.
- It is good practice to normalize features that use different scales and ranges.
- Although the model might converge without feature normalization, training is more difficult, and the resulting model depends on the units chosen for the inputs.
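
As a small illustration of the idea, the sketch below standardizes two features that live on very different scales. The arrays are hypothetical and unrelated to the dataset used here. The statistics come from the training rows only and are then reused for the test rows.

```python
import numpy as np

# Hypothetical features on very different scales.
train = np.array([[1.0, 1000.0], [2.0, 3000.0], [3.0, 5000.0]])
test = np.array([[2.5, 2000.0]])

# The statistics are computed on the training set only...
mean = train.mean(axis=0)
std = train.std(axis=0)

# ...and reused for the test set, so no test information leaks into training.
train_scaled = (train - mean) / std
test_scaled = (test - mean) / std

print(train_scaled.mean(axis=0))  # close to 0 for every feature
print(train_scaled.std(axis=0))   # close to 1 for every feature
print(test_scaled)
```

After scaling, both features contribute on comparable terms, whatever units they were recorded in.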

In [None]:
X_train

In [None]:
train_mean = X_train.mean(axis=0)
train_std = X_train.std(axis=0)

In [None]:
train_mean

In [None]:
train_std

__Normalization of the train features__

In [None]:
X_train = (X_train - train_mean) / train_std

In [None]:
X_train

In [None]:
plot_classes(X_train, y_train)

__Normalization of the test features__

In [None]:
X_test = (X_test - train_mean) / train_std

# <font color="red">Creating the ML model</font>

## <font color="blue">Set the hyperparameters</font>

It is a good practice to declare the following parameters before creating the model for ease of change and understanding.

__Dataset parameters__

These parameters are defined by the dataset used:

- number of features
- number of classes to predict

In [None]:
input_size = 2
num_classes = 2

__Model parameters__

- batch size
- number of epochs
- learning rate (step size used by the optimizer)

In [None]:
batch_size = 4
num_epochs = 20
learning_rate = 0.15

### <font color="green"> Convert class vectors to binary class matrices</font>

- The targets have 2 possible integer values: `0` and `1`.
- We use the `to_categorical` function to convert integer targets into categorical (one-hot) vectors:
  - `0` would become `[1, 0]` (it’s zero-indexed).
  - `1` would become `[0, 1]`.
- This is needed because `Keras` expects the training targets to be 2-dimensional vectors, since there are 2 nodes in the output layer.
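
For intuition, the same one-hot conversion can be sketched in plain NumPy. This is only an illustration with made-up labels, not how `to_categorical` is implemented.

```python
import numpy as np

num_classes = 2
y = np.array([0, 1, 1, 0])   # hypothetical integer labels

# Row i of the identity matrix is the one-hot vector for class i.
one_hot = np.eye(num_classes)[y]
print(one_hot)

# Taking the argmax along the class axis recovers the integer labels.
recovered = np.argmax(one_hot, axis=1)
print(recovered)
```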

In [None]:
y_train_convert = tf.keras.utils.to_categorical(y_train, num_classes)
y_test_convert = tf.keras.utils.to_categorical(y_test, num_classes)

In [None]:
print(y_train_convert[0])

In [None]:
print(y_train_convert[1])

## <font color="blue">Build the TensorFlow model</font>

### <font color="green"> Instantiate a sequential model using `keras`</font>

We create a sequential Neural Network:

- The model expects rows of data with `input_size` variables.
- The output layer has `num_classes` nodes.

There is no hidden layer and no activation function, so the model outputs one raw score per class.
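
To make the layer concrete, here is a minimal NumPy sketch of what a single dense layer without activation computes. The kernel `W`, bias `b`, and points `X` are made-up values. The `softmax` helper is not part of the model and only shows one common way to turn raw scores into probabilities.

```python
import numpy as np

# Hypothetical parameters; a Keras Dense kernel has shape (input_size, num_classes).
W = np.array([[0.5, -0.5],
              [1.0, -1.0]])
b = np.array([0.1, -0.1])

# Two sample points (x1, x2).
X = np.array([[1.0, 2.0],
              [-1.0, 0.0]])

# A dense layer without activation returns one raw score per class.
logits = X @ W + b

def softmax(z):
    """Turn each row of scores into probabilities that sum to 1."""
    z = z - z.max(axis=1, keepdims=True)   # subtract the row maximum for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

probs = softmax(logits)
print(logits)
print(probs)
```

Column `j` of `W` holds the weights feeding output node `j`, which matters when individual weights are read back from the trained model.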

In [None]:
model = tf.keras.Sequential([
    tf.keras.layers.Dense(num_classes, input_shape=(input_size,)),
])

### <font color="green"> Visualize the model's architecture</font>

In [None]:
plot_model(model,
           #to_file='keras_model_plot.png',
           show_shapes=True,
           show_layer_names=True)

### <font color="green">  Inspect the model</font>

`model.summary()` is a useful method if you want to get an overview of your model and see the total number of parameters.
It prints:

- Name and type of all layers in the model.
- Output shape for each layer.
- Number of weight parameters of each layer.
- If the model has general topology, the inputs each layer receives.
- The total number of trainable and non-trainable parameters of the model.
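
The parameter count of a fully connected layer can be checked by hand: one weight per (input, unit) pair plus one bias per unit. The helper below is a hypothetical illustration, not a Keras function.

```python
def dense_param_count(n_inputs: int, n_units: int, use_bias: bool = True) -> int:
    """Number of trainable parameters in one fully connected layer."""
    return n_inputs * n_units + (n_units if use_bias else 0)

# Two input features and two output nodes: 4 weights + 2 biases.
print(dense_param_count(2, 2))
```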

In [None]:
model.summary()

In [None]:
import tensorflow.keras.backend as K

trainable_count = np.sum([K.count_params(w) for w in model.trainable_weights])
non_trainable_count = np.sum([K.count_params(w) for w in model.non_trainable_weights])

print(f'        Total params: {trainable_count + non_trainable_count}')
print(f'    Trainable params: {trainable_count}')
print(f'Non-trainable params: {non_trainable_count}')

### <font color="green">  Compile the model</font>
- Once you have specified the architecture of the network, you need to choose an optimizer, which updates the weights from the back-propagated gradients, and a loss function.
- Compiling the model configures it for training; TensorFlow performs the numerical computations in the background.

Define the optimizer:

In [None]:
optimizer = tf.keras.optimizers.Adam(learning_rate=learning_rate)

A loss function and an optimizer are required:
- We ask the network to use the `Adam` optimizer to update the weights so that the `categorical_crossentropy` loss decreases during training.
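
To see what the loss measures, here is a minimal NumPy sketch of categorical cross-entropy computed from raw class scores. It is an illustration under simplifying assumptions (the Keras implementation differs in details such as clipping), and all arrays are made up.

```python
import numpy as np

def categorical_crossentropy(y_true, logits):
    """Mean cross-entropy between one-hot targets and softmax(logits)."""
    z = logits - logits.max(axis=1, keepdims=True)                 # numerical stability
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))   # log of the softmax
    return -(y_true * log_probs).sum(axis=1).mean()

y_true = np.array([[1.0, 0.0], [0.0, 1.0]])             # one-hot targets

confident_right = np.array([[4.0, -4.0], [-4.0, 4.0]])  # high score on the true class
undecided = np.zeros((2, 2))                            # equal scores for both classes
confident_wrong = -confident_right                      # high score on the wrong class

loss_right = categorical_crossentropy(y_true, confident_right)
loss_undecided = categorical_crossentropy(y_true, undecided)
loss_wrong = categorical_crossentropy(y_true, confident_wrong)

print(loss_right, loss_undecided, loss_wrong)
```

The loss is close to zero for confident correct predictions, equals `log(2)` when both classes receive the same score, and grows when the model is confidently wrong.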

In [None]:
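# The output layer has no activation, so the model returns raw scores (logits);
# from_logits=True makes the loss apply the softmax internally.
model.compile(loss = tf.keras.losses.CategoricalCrossentropy(from_logits=True),
              optimizer = optimizer,
              metrics = ['accuracy'])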

## <font color="blue">Train the model</font>

- We use the `fit` method to train the model.
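
As a rough illustration of how the batch size and the number of epochs interact, assume that each epoch walks through the whole training set once in consecutive batches. The number of weight updates can then be estimated as follows. The sample count is hypothetical and the helper is not a Keras function.

```python
import math

def steps_per_epoch(n_samples: int, batch_size: int) -> int:
    """Number of batches needed to cover the training set once."""
    return math.ceil(n_samples / batch_size)

n_samples = 14     # hypothetical number of training samples
batch_size = 4
num_epochs = 20

steps = steps_per_epoch(n_samples, batch_size)
total_updates = steps * num_epochs

print(steps, total_updates)
```

The last batch of an epoch is smaller than the others whenever the sample count is not a multiple of the batch size.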

In [None]:
history = model.fit(
    X_train, y_train_convert,  
    batch_size = batch_size,
    epochs = num_epochs,
    verbose = 1, 
    validation_data = (X_test, y_test_convert)
)

### <font color="green">  Visualize the model's training progress</font>

In [None]:
plt.plot(history.history['accuracy'], label='accuracy')
plt.plot(history.history['val_accuracy'], label='val_accuracy')
plt.plot(history.history['loss'], label='loss')
plt.plot(history.history['val_loss'], label='val_loss')
plt.legend()
plt.show()

## <font color="blue">Evaluate the Model on Test Data</font>

### <font color="green"> Compute the scores</font>

In [None]:
loss, accuracy = model.evaluate(X_test, y_test_convert, verbose=0)

In [None]:
print(f"Test data:")
print(f"\t loss = {loss} \n\t accuracy = {accuracy}")

### <font color="green">  Make Predictions</font>

In [None]:
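# The output layer has no activation, so these are raw class scores, not probabilities.
scores = model.predict(X_test)

In [None]:
# The predicted class is the one with the highest score.
y_test_pred = np.argmax(scores, axis=1)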
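
Predicted classes can be compared with the true ones through the accuracy and a confusion matrix. The sketch below uses plain NumPy and made-up labels; `sklearn.metrics` offers `accuracy_score` and `confusion_matrix` for the same purpose.

```python
import numpy as np

# Hypothetical true and predicted classes.
y_true = np.array([0, 0, 1, 1, 1, 0])
y_pred = np.array([0, 1, 1, 1, 0, 0])

# Fraction of samples whose predicted class matches the true class.
accuracy = np.mean(y_true == y_pred)

# cm[i, j] counts the samples of true class i that were predicted as class j.
cm = np.zeros((2, 2), dtype=int)
np.add.at(cm, (y_true, y_pred), 1)

print(accuracy)
print(cm)
```

The diagonal of the matrix holds the correct predictions, so its sum divided by the number of samples gives the accuracy again.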

In [None]:
print('\t Model trainable parameters: \n')
print(f"Weights: \n {model.get_weights()[0]}")
print()
print(f"Biases:  \n {model.get_weights()[1]}")

In [None]:
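def comp_boundary(model):
    """Return the end points of the line where both class scores are equal."""
    # Kernel shape: (input_size, num_classes); bias shape: (num_classes,)
    weights, biases = model.get_weights()

    # Class 1 is predicted when its score exceeds the score of class 0, so the
    # boundary depends on the difference between the two output nodes.
    w1 = weights[0][1] - weights[0][0]
    w2 = weights[1][1] - weights[1][0]
    b = biases[1] - biases[0]

    print(f"w1 = {w1}")
    print(f"w2 = {w2}")
    print(f" b = {b}")
    print()

    # Solve w1 * x1 + w2 * x2 + b = 0 for x2 at both ends of the x1 range.
    x1_min = -20
    x2_min = (-(w1 * x1_min) - b) / w2

    x1_max = 20
    x2_max = (-(w1 * x1_max) - b) / w2

    return x1_min, x1_max, x2_min, x2_max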

In [None]:
boundary = comp_boundary(model)
boundary

In [None]:
plot_classes(X_test, y_test_pred, boundary)

In [None]:
plot_classes(X_test, y_test, boundary)
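
For a linear model with two output nodes, the decision line is the set of points where both class scores are equal. The sketch below checks this numerically with a hypothetical kernel `W` and bias `b` (not the trained values). The helper `x2_on_boundary` is introduced only for this illustration.

```python
import numpy as np

# Hypothetical kernel of shape (input_size, num_classes) and bias of shape (num_classes,).
W = np.array([[1.0, -1.0],
              [0.5, -0.5]])
b = np.array([0.2, -0.2])

# The score of class 1 minus the score of class 0 is itself a linear function of (x1, x2).
w1, w2 = W[:, 1] - W[:, 0]
b_diff = b[1] - b[0]

def x2_on_boundary(x1):
    """Solve w1 * x1 + w2 * x2 + b_diff = 0 for x2."""
    return (-(w1 * x1) - b_diff) / w2

# Points on the line receive the same score for both classes.
x1 = np.array([-2.0, 0.0, 3.0])
points = np.column_stack([x1, x2_on_boundary(x1)])
scores = points @ W + b

print(points)
print(scores)
```

Points on one side of the line get a higher score for class `0`, points on the other side a higher score for class `1`.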

# <font color="red">Useful references</font>

- <a href="https://www.mygreatlearning.com/blog/what-is-tensorflow-machine-learning-library-explained/">What is TensorFlow? The Machine Learning Library Explained</a>
- <a href="https://www.tensorflow.org/tutorials/keras/regression">Basic regression: Predict fuel efficiency</a>
- <a href="https://stackabuse.com/tensorflow-2-0-solving-classification-and-regression-problems/">Tensorflow 2.0: Solving Classification and Regression Problems</a>
- <a href="https://www.toptal.com/machine-learning/tensorflow-machine-learning-tutorial">Getting Started with TensorFlow: A Machine Learning Tutorial</a>
- <a href="https://sebastianraschka.com/faq/docs/tensorflow-vs-scikitlearn.html">What is the main difference between TensorFlow and scikit-learn?</a>
- <a href="https://adventuresinmachinelearning.com/python-tensorflow-tutorial/">Python TensorFlow Tutorial – Build a Neural Network</a>
- <a href="https://steadforce.com/en/first-steps-tensorflow-part-3/">A simple neural network with TensorFlow</a>