<a href="https://colab.research.google.com/github/Undasnr/DL-ML/blob/main/Ronny_TensorFlow_Assignment.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**1. Looking back on scratch**

List of core components needed to implement a deep learning model:
1. Initialization of Weights and Biases: You can't start without them. These are the parameters that the model will learn. You need to create tensors to hold these values and initialize them, often with small random numbers or zeros. This is a manual process without tf.keras.layers.

2. Forward Propagation: This is the core of the model. You must write the code to compute the output of the network given an input. This involves matrix multiplications (weights times input) and adding biases, followed by the application of an activation function for each layer.

3. Loss Function: To train the model, you need a way to measure how wrong its predictions are. You must define a function that takes the model's output and the true labels and returns a single loss value. For binary classification, this would typically be binary cross-entropy.

4. Optimizer: This is the engine of the learning process. It is the algorithm that turns the loss into parameter updates. You have to implement an optimization algorithm, such as gradient descent or Adam, which takes the gradients of the loss with respect to the weights and biases and moves each parameter a small step in the opposite direction of its gradient.

5. Epoch and Batch Loops: Training is an iterative process. You need an outer loop that runs for a number of epochs and an inner loop that processes the data in smaller chunks called batches. This is known as mini-batch gradient descent; it updates the parameters many times per pass over the data and needs far less memory per step than processing the entire dataset at once.

6. Backpropagation: This is the process of calculating the gradients. In TensorFlow's low-level API, this is usually handled by tf.GradientTape, which records the operations of the forward pass so that the gradients the optimizer needs can be computed automatically.

7. Accuracy Metric: During and after training, you need to evaluate the model's performance. You'll have to write a function to compare the predicted class with the true class and calculate a metric like accuracy.

8. Data Preprocessing and Splitting: Even at a low level, data must be prepared. This involves loading the data, separating features from labels, and splitting into training and testing sets. You also need to normalize the features and, for multi-class problems, one-hot encode the labels.
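
To make the list above concrete, here is a minimal from-scratch sketch in NumPy. It is not the assignment's model: it assumes a single-layer logistic regression trained on synthetic data, and every name in it (`forward`, `bce`, `accuracy`, the toy `true_w`) is made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Data preparation: synthetic, linearly separable toy data (an assumption).
X = rng.normal(size=(200, 4))
true_w = np.array([1.5, -2.0, 0.5, 1.0])
y = (X @ true_w > 0).astype(np.float64)
X_train, X_test = X[:160], X[160:]
y_train, y_test = y[:160], y[160:]

# Initialization of weights and bias.
w = rng.normal(scale=0.01, size=4)
b = 0.0

def forward(X, w, b):
    """Forward propagation: affine map followed by a sigmoid."""
    return 1.0 / (1.0 + np.exp(-(X @ w + b)))

def bce(p, y, eps=1e-12):
    """Loss function: binary cross-entropy averaged over the batch."""
    p = np.clip(p, eps, 1 - eps)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

def accuracy(p, y):
    """Accuracy metric: fraction of thresholded predictions matching labels."""
    return np.mean((p >= 0.5) == (y == 1))

lr, batch_size, num_epochs = 0.1, 16, 50
initial_loss = bce(forward(X_train, w, b), y_train)

for epoch in range(num_epochs):                      # epoch loop
    order = rng.permutation(len(X_train))
    for start in range(0, len(order), batch_size):   # batch loop
        idx = order[start:start + batch_size]
        xb, yb = X_train[idx], y_train[idx]
        p = forward(xb, w, b)
        # Backpropagation: for sigmoid + cross-entropy, dL/dz = (p - y) / batch
        dz = (p - yb) / len(idx)
        grad_w = xb.T @ dz
        grad_b = dz.sum()
        # Optimizer: plain gradient descent step
        w -= lr * grad_w
        b -= lr * grad_b

final_loss = bce(forward(X_train, w, b), y_train)
test_acc = accuracy(forward(X_test, w, b), y_test)
print(f"loss {initial_loss:.3f} -> {final_loss:.3f}, test accuracy {test_acc:.3f}")
```

Each commented step corresponds to one item in the list; a deeper network only changes the forward pass and the backpropagation lines.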



In [14]:
import pandas as pd
import tensorflow as tf
import numpy as np

# Loading the dataset
df = pd.read_csv('Iris.csv')

# Filtering for the two specified species: 'Iris-versicolor' and 'Iris-virginica'
df = df[(df['Species'] == 'Iris-versicolor') | (df['Species'] == 'Iris-virginica')].copy()

# Mapping the species names to numerical labels (0 and 1)
# Assigning the result back avoids the chained in-place replace that pandas warns about
df['Species'] = df['Species'].map({'Iris-versicolor': 0, 'Iris-virginica': 1})

# Separating features (X) and labels (y)
X = df.iloc[:, 1:5].values
y = df['Species'].values

# Splitting the data into training and testing sets
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Converting to TensorFlow tensors
X_train_tensor = tf.constant(X_train, dtype=tf.float32)
y_train_tensor = tf.constant(y_train, dtype=tf.float32)
X_test_tensor = tf.constant(X_test, dtype=tf.float32)
y_test_tensor = tf.constant(y_test, dtype=tf.float32)

print("Training features shape:", X_train_tensor.shape)
print("Training labels shape:", y_train_tensor.shape)

Training features shape: (80, 4)
Training labels shape: (80,)




**2. Consider the correspondence between scratch and TensorFlow**

The provided sample code uses TensorFlow 1.x, which operates on a static computation graph. This is a very low-level approach, similar to what you would build from scratch, and it's a great way to understand the underlying mechanics. Let's break down how the scratch components you listed map to the TensorFlow 1.x code.

Initialization of Weights and Biases:

Scratch: You would manually create arrays or matrices to store weights and biases, and fill them with random numbers.

TensorFlow: This is handled by tf.Variable. The code tf.Variable(tf.random_normal([n_input, n_hidden1])) creates a TensorFlow variable and initializes it with random values. These variables are part of the computation graph and their values can be modified during training. tf.global_variables_initializer() is a separate operation that runs the initializers for all tf.Variable objects.

Forward Propagation:

Scratch: You would write a series of matrix multiplications and additions using a library like NumPy, followed by applying activation functions.

TensorFlow: The entire model is defined as a function (example_net). The operations like tf.matmul (matrix multiplication), tf.add, and tf.nn.relu (ReLU activation) define the static computation graph. This graph is not executed until a session is run.
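
As a rough illustration of the "scratch" side of this comparison, the forward pass can be written in NumPy as below. The layer sizes (4, 50, 100, 1) mirror the hyperparameters used later in this notebook; the `params` dictionary and the zero-initialized biases are assumptions of this sketch, not the sample code.

```python
import numpy as np

rng = np.random.default_rng(0)
n_input, n_hidden1, n_hidden2, n_classes = 4, 50, 100, 1

# Hypothetical parameter container; the sample code keeps separate dicts.
params = {
    "w1": rng.normal(size=(n_input, n_hidden1)),
    "b1": np.zeros(n_hidden1),
    "w2": rng.normal(size=(n_hidden1, n_hidden2)),
    "b2": np.zeros(n_hidden2),
    "w3": rng.normal(size=(n_hidden2, n_classes)),
    "b3": np.zeros(n_classes),
}

def relu(z):
    """Element-wise ReLU activation."""
    return np.maximum(z, 0.0)

def forward(x, params):
    """Two hidden ReLU layers followed by a linear output (logits)."""
    h1 = relu(x @ params["w1"] + params["b1"])
    h2 = relu(h1 @ params["w2"] + params["b2"])
    return h2 @ params["w3"] + params["b3"]

x = rng.normal(size=(10, n_input))   # a batch of 10 made-up samples
logits = forward(x, params)
print(logits.shape)
```

Unlike the static graph, every line here runs immediately, which is the behavior TensorFlow 2.x eager execution also provides.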

Loss Function:

Scratch: You'd implement the loss calculation (e.g., binary cross-entropy) as a custom function.

TensorFlow: This is done with tf.nn.sigmoid_cross_entropy_with_logits and tf.reduce_mean. The tf.nn function takes the raw logits (no sigmoid applied beforehand) and evaluates the cross-entropy in a numerically stable form, which a naive manual implementation based on log(sigmoid(x)) does not guarantee. The tf.reduce_mean operation then averages the per-example losses over the batch.
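
The sketch below shows, in NumPy, the stable formulation that the TensorFlow documentation gives for this loss, `max(x, 0) - x*z + log(1 + exp(-|x|))`, next to a naive version. The function names are reused here only for readability; this is an illustration, not TensorFlow's actual implementation.

```python
import numpy as np

def sigmoid_cross_entropy_with_logits(labels, logits):
    """Stable form: max(x, 0) - x*z + log(1 + exp(-|x|))."""
    x = np.asarray(logits, dtype=np.float64)
    z = np.asarray(labels, dtype=np.float64)
    return np.maximum(x, 0.0) - x * z + np.log1p(np.exp(-np.abs(x)))

def naive_bce(labels, logits):
    """Textbook form; breaks down when sigmoid(x) rounds to exactly 0 or 1."""
    p = 1.0 / (1.0 + np.exp(-logits))
    return -(labels * np.log(p) + (1 - labels) * np.log(1 - p))

labels = np.array([0.0, 1.0, 1.0, 0.0])
logits = np.array([-2.0, 3.0, -1.0, 0.5])

stable = sigmoid_cross_entropy_with_logits(labels, logits)
naive = naive_bce(labels, logits)
print(np.allclose(stable, naive))   # the two agree for moderate logits

# For an extreme logit the stable form stays finite.
extreme = sigmoid_cross_entropy_with_logits(np.array([0.0]), np.array([1000.0]))
print(extreme)
```

Averaging `stable` with `np.mean` plays the role of `tf.reduce_mean` in the sample code.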

Optimizer:

Scratch: You would manually calculate gradients and update weights using a small learning rate.

TensorFlow: This is abstracted by tf.train.AdamOptimizer. You create an optimizer instance and then call its minimize method, which automatically computes the gradients (tf.gradients under the hood) and updates the variables to reduce the loss. This is a significant abstraction from the scratch implementation.
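
For comparison, a hand-written version of what the optimizer does on each step might look like the following. It uses the standard Adam update rule with its usual textbook defaults; TensorFlow's own implementation may differ in details such as the default epsilon, and `adam_step` is a hypothetical helper name.

```python
import numpy as np

def adam_step(param, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update; returns the new parameter and moment estimates."""
    m = beta1 * m + (1 - beta1) * grad          # first moment (mean of gradients)
    v = beta2 * v + (1 - beta2) * grad ** 2     # second moment (mean of squares)
    m_hat = m / (1 - beta1 ** t)                # bias correction, t starts at 1
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v

# Toy problem (an assumption): minimize f(w) = sum((w - 3)^2), gradient 2(w - 3).
w = np.zeros(2)
m = np.zeros_like(w)
v = np.zeros_like(w)
for t in range(1, 3001):
    grad = 2.0 * (w - 3.0)
    w, m, v = adam_step(w, grad, m, v, t, lr=0.01)

print(w)
```

In the scratch setting the gradient would come from your own backpropagation code; `minimize` bundles both the gradient computation and this update.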

Epoch and Batch Loops:

Scratch: You would use standard Python for loops to iterate through epochs and mini-batches.

TensorFlow: The code still uses a standard Python for loop for epochs. However, a custom GetMiniBatch iterator class is written to manage the batching process, shuffling the data, and providing mini-batches. This part is a manual, "scratch-like" implementation because it's a data-handling task, not a core TensorFlow operation.
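
A generator is one lightweight alternative to an iterator class for this job. The sketch below is an assumption-laden variant, not the sample's GetMiniBatch: the name `iterate_minibatches` is made up, and it reshuffles on every call so that each epoch sees a different batch order.

```python
import numpy as np

def iterate_minibatches(X, y, batch_size, rng):
    """Yield shuffled (X, y) mini-batches; the last batch may be smaller."""
    order = rng.permutation(len(X))
    for start in range(0, len(X), batch_size):
        idx = order[start:start + batch_size]
        yield X[idx], y[idx]

# Made-up data: 25 samples with 2 features each.
X = np.arange(50, dtype=np.float32).reshape(25, 2)
y = np.arange(25)
rng = np.random.default_rng(0)

for epoch in range(2):
    batches = list(iterate_minibatches(X, y, batch_size=10, rng=rng))
    sizes = [len(xb) for xb, _ in batches]
    print(epoch, sizes)
```

With 25 samples and a batch size of 10, each epoch produces two full batches and one batch of 5.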

Backpropagation:

Scratch: This is the most complex part to implement manually, involving the chain rule to compute gradients.

TensorFlow: It's fully automated by the optimizer. When you call optimizer.minimize(loss_op), TensorFlow automatically calculates the gradients of loss_op with respect to all the trainable variables (tf.Variable) in the graph. Because the whole graph is defined before execution, the gradient operations are added to it once and then reused on every training step.
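
To show what is being automated, here is a small manual backpropagation sketch, checked against finite differences. It assumes a one-hidden-layer network with tanh activation and a mean squared error loss, chosen only because the derivatives are smooth; it is not the network used in the sample code.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 3))
y = rng.normal(size=(5, 1))
params = {
    "W1": rng.normal(size=(3, 4)), "b1": rng.normal(size=4),
    "W2": rng.normal(size=(4, 1)), "b2": rng.normal(size=1),
}

def loss_fn(p):
    """Forward pass and mean squared error."""
    h = np.tanh(x @ p["W1"] + p["b1"])
    pred = h @ p["W2"] + p["b2"]
    return np.mean((pred - y) ** 2)

def backprop(p):
    """Gradients of loss_fn via the chain rule, layer by layer."""
    h = np.tanh(x @ p["W1"] + p["b1"])
    pred = h @ p["W2"] + p["b2"]
    d_pred = 2.0 * (pred - y) / y.size        # dL/dpred
    dh = d_pred @ p["W2"].T                   # back through the output layer
    dz1 = dh * (1.0 - h ** 2)                 # tanh'(z) = 1 - tanh(z)^2
    return {
        "W1": x.T @ dz1, "b1": dz1.sum(axis=0),
        "W2": h.T @ d_pred, "b2": d_pred.sum(axis=0),
    }

def numerical_grad(p, name, eps=1e-6):
    """Central finite differences for one parameter array."""
    grad = np.zeros_like(p[name])
    for i in np.ndindex(p[name].shape):
        old = p[name][i]
        p[name][i] = old + eps
        plus = loss_fn(p)
        p[name][i] = old - eps
        minus = loss_fn(p)
        p[name][i] = old                      # restore the original value
        grad[i] = (plus - minus) / (2 * eps)
    return grad

analytic = backprop(params)
max_err = max(np.abs(analytic[n] - numerical_grad(params, n)).max() for n in params)
print(f"max abs difference: {max_err:.2e}")
```

A tiny difference between the two gradients indicates the chain-rule code is consistent with the loss; automatic differentiation removes the need to write `backprop` by hand.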

Placeholders:

Scratch: You would just use Python variables to hold your input data.

TensorFlow: tf.placeholder is used to create a "hole" in the computation graph where you can feed data. The data is passed into the graph at runtime using the feed_dict parameter of sess.run().

The core difference is that in the low-level TensorFlow 1.x approach, you first build a static graph of all the computations and then use a tf.Session to execute it, feeding in data as needed. This is different from the eager execution of TensorFlow 2.x and a scratch-like approach, where operations are performed immediately.

---
Application to Other Datasets

Adapting the code for two datasets

This requires changing the data loading, preprocessing, and model architecture (especially the output layer) to suit the new problems.

A. Iris Dataset (All Three Classes)

This is a multi-class classification problem. The key changes are:

The labels must be one-hot encoded.

The final layer of the neural network must have 3 neurons (one for each class).

The loss function must be softmax_cross_entropy_with_logits.

The correct_pred logic must be changed for multi-class.
```python
import numpy as np
import pandas as pd
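# tf.placeholder belongs to the TensorFlow 1.x API; under TensorFlow 2.x it is reached via compat.v1
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()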
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder

# Load dataset
df=pd.read_csv("Iris.csv")
y=df["Species"]
X=df.loc[:, ["SepalLengthCm", "SepalWidthCm", "PetalLengthCm", "PetalWidthCm"]]
X=np.array(X)
y=np.array(y).reshape(-1, 1) # Reshape for one-hot encoding

# One-hot encode labels
ohe = OneHotEncoder(handle_unknown='ignore', sparse_output=False)
y = ohe.fit_transform(y)

# Split data
X_train, X_test, y_train, y_test=train_test_split(X, y, test_size=0.2, random_state=0)
X_train, X_val, y_train, y_val=train_test_split(X_train, y_train, test_size=0.2, random_state=0)

# Hyperparameters and placeholders
n_classes = 3 # Now 3 classes
n_input = X_train.shape[1]
X_tf = tf.placeholder("float", [None, n_input])
Y_tf = tf.placeholder("float", [None, n_classes])

# (The rest of the model definition and training loop is similar, with these changes)
# Weights and biases now have n_classes
# The final layer: tf.Variable(tf.random_normal([n_hidden2, n_classes]))
# The loss function: tf.nn.softmax_cross_entropy_with_logits(labels=Y_tf, logits=logits)
# The accuracy metric: tf.equal(tf.argmax(logits, 1), tf.argmax(Y_tf, 1))
```
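
The multi-class changes listed above (one-hot labels, softmax cross-entropy on logits, argmax-based accuracy) can be sketched in NumPy as follows. The logits and class ids are made-up values, and the helper names are illustrative rather than TensorFlow's implementation.

```python
import numpy as np

def one_hot(class_ids, n_classes):
    """Turn integer class ids into one-hot rows."""
    return np.eye(n_classes)[class_ids]

def softmax_cross_entropy_with_logits(labels, logits):
    """Per-example cross-entropy, using the log-sum-exp shift for stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -(labels * log_probs).sum(axis=1)

# Made-up logits for three samples and three classes.
logits = np.array([[2.0, 0.5, -1.0],
                   [0.1, 0.2, 3.0],
                   [1.0, 1.0, 1.0]])
class_ids = np.array([0, 2, 1])
labels = one_hot(class_ids, 3)

per_example = softmax_cross_entropy_with_logits(labels, logits)
loss = per_example.mean()

# Multi-class accuracy: compare the index of the largest logit with the label index.
correct = np.argmax(logits, axis=1) == np.argmax(labels, axis=1)
accuracy = correct.mean()
print(loss, accuracy)
```

The third sample has equal logits, so its loss is log(3) and `argmax` falls back to class 0, which counts as a miss against the true class 1.
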
B. House Prices Dataset

This is a regression problem, not classification. We are predicting a continuous value (price) instead of a discrete class. The core changes are:

The output layer has only one neuron.

There is no activation function on the final layer.

The loss function is a regression loss, like Mean Squared Error (MSE).

The labels should not be one-hot encoded.
```python
import numpy as np
import pandas as pd
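# tf.placeholder belongs to the TensorFlow 1.x API; under TensorFlow 2.x it is reached via compat.v1
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()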
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Load dataset
df = pd.read_csv("House_Prices.csv") # Assuming a House_Prices.csv with features and a "Price" column
y = df["Price"]
X = df.drop("Price", axis=1)
X = np.array(X)
y = np.array(y).reshape(-1, 1)

# Scale the data for regression
scaler_X = StandardScaler()
scaler_y = StandardScaler()
X = scaler_X.fit_transform(X)
y = scaler_y.fit_transform(y)

# Split data
X_train, X_test, y_train, y_test=train_test_split(X, y, test_size=0.2, random_state=0)
X_train, X_val, y_train, y_val=train_test_split(X_train, y_train, test_size=0.2, random_state=0)

# Hyperparameters and placeholders
n_classes = 1 # Only one output for price
n_input = X_train.shape[1]
X_tf = tf.placeholder("float", [None, n_input])
Y_tf = tf.placeholder("float", [None, n_classes])

# (Model definition and training loop changes)
# The final layer: tf.Variable(tf.random_normal([n_hidden2, n_classes]))
# NO activation function on the final layer
# The loss function: tf.losses.mean_squared_error(labels=Y_tf, predictions=logits)
# No accuracy metric is used. Instead, we would report MSE or RMSE.
```

In [18]:
"""
Binary classification of Iris dataset using a neural network implemented in TensorFlow 2.x.
This model is configured to classify only 'Iris-versicolor' and 'Iris-virginica'.
"""
import numpy as np
import pandas as pd
import tensorflow as tf
from sklearn.model_selection import train_test_split

# 1. Data Preparation
# Load the dataset
df = pd.read_csv("Iris.csv")
# Filter for the two specified species
df = df[(df["Species"] == "Iris-versicolor") | (df["Species"] == "Iris-virginica")]
y = df["Species"]
X = df.loc[:, ["SepalLengthCm", "SepalWidthCm", "PetalLengthCm", "PetalWidthCm"]]
X = np.array(X, dtype=np.float32) # Ensure X is float32
y = np.array(y)

# Convert labels to numbers (0 and 1)
y[y == "Iris-versicolor"] = 0
y[y == "Iris-virginica"] = 1
y = y.astype(np.float32)[:, np.newaxis]

# Split into train, val, and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.2, random_state=0)

# Define a simple mini-batch iterator
class GetMiniBatch:
    def __init__(self, X, y, batch_size=10, seed=0):
        self.batch_size = batch_size
        np.random.seed(seed)
        shuffle_index = np.random.permutation(np.arange(X.shape[0]))
        self.X = X[shuffle_index]
        self.y = y[shuffle_index]
        self._stop = np.ceil(X.shape[0] / self.batch_size).astype(np.int64)

    def __len__(self):
        return self._stop

    def __getitem__(self, item):
        p0 = item * self.batch_size
        p1 = item * self.batch_size + self.batch_size
        return self.X[p0:p1], self.y[p0:p1]

    def __iter__(self):
        self._counter = 0
        return self

    def __next__(self):
        if self._counter >= self._stop:
            raise StopIteration()
        p0 = self._counter * self.batch_size
        p1 = self._counter * self.batch_size + self.batch_size
        self._counter += 1
        return self.X[p0:p1], self.y[p0:p1]

# 2. Hyperparameter and Model Setup
learning_rate = 0.001
batch_size = 10
num_epochs = 100
n_hidden1 = 50
n_hidden2 = 100
n_input = X_train.shape[1]
n_classes = 1

# Define weights and biases as tf.Variable for TensorFlow 2.x
tf.random.set_seed(0)
weights = {
    'w1': tf.Variable(tf.random.normal([n_input, n_hidden1], dtype=tf.float32)),
    'w2': tf.Variable(tf.random.normal([n_hidden1, n_hidden2], dtype=tf.float32)),
    'w3': tf.Variable(tf.random.normal([n_hidden2, n_classes], dtype=tf.float32))
}
biases = {
    'b1': tf.Variable(tf.random.normal([n_hidden1], dtype=tf.float32)),
    'b2': tf.Variable(tf.random.normal([n_hidden2], dtype=tf.float32)),
    'b3': tf.Variable(tf.random.normal([n_classes], dtype=tf.float32))
}

# Define the Adam optimizer
optimizer = tf.optimizers.Adam(learning_rate=learning_rate)

# --- 3. Model Architecture (Forward Pass) ---
def example_net(x):
    """Simple 3-layer neural network for binary classification."""
    layer_1 = tf.add(tf.matmul(x, weights['w1']), biases['b1'])
    layer_1 = tf.nn.relu(layer_1)
    layer_2 = tf.add(tf.matmul(layer_1, weights['w2']), biases['b2'])
    layer_2 = tf.nn.relu(layer_2)
    layer_output = tf.matmul(layer_2, weights['w3']) + biases['b3']
    return layer_output

# 4. Training and Evaluation Loop
# This loop now uses Eager Execution and tf.GradientTape for backpropagation.
get_mini_batch_train = GetMiniBatch(X_train, y_train, batch_size=batch_size)
for epoch in range(num_epochs):
    total_loss = 0
    total_acc = 0
    for mini_batch_x, mini_batch_y in get_mini_batch_train:
        # Use tf.GradientTape to record operations for automatic differentiation
        with tf.GradientTape() as tape:
            logits = example_net(mini_batch_x)
            loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(labels=mini_batch_y, logits=logits))

        # Calculate gradients and apply them
        gradients = tape.gradient(loss, [weights['w1'], weights['w2'], weights['w3'],
                                         biases['b1'], biases['b2'], biases['b3']])
        optimizer.apply_gradients(zip(gradients, [weights['w1'], weights['w2'], weights['w3'],
                                                  biases['b1'], biases['b2'], biases['b3']]))

        # Calculate accuracy for the mini-batch
        correct_pred = tf.equal(tf.sign(mini_batch_y - 0.5), tf.sign(tf.sigmoid(logits) - 0.5))
        acc = tf.reduce_mean(tf.cast(correct_pred, tf.float32))

        total_loss += loss.numpy()
        total_acc += acc.numpy()

    # Calculate average loss and accuracy for the epoch
    total_batch = len(get_mini_batch_train)
    total_loss /= total_batch
    total_acc /= total_batch

    # Validation
    val_logits = example_net(X_val)
    val_loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(labels=y_val, logits=val_logits))
    val_correct_pred = tf.equal(tf.sign(y_val - 0.5), tf.sign(tf.sigmoid(val_logits) - 0.5))
    val_acc = tf.reduce_mean(tf.cast(val_correct_pred, tf.float32))

    print(f"Epoch {epoch+1:03d}, Loss: {total_loss:.4f}, Val Loss: {val_loss.numpy():.4f}, Accuracy: {total_acc:.3f}, Val Acc: {val_acc.numpy():.3f}")

# Test
test_logits = example_net(X_test)
test_correct_pred = tf.equal(tf.sign(y_test - 0.5), tf.sign(tf.sigmoid(test_logits) - 0.5))
test_acc = tf.reduce_mean(tf.cast(test_correct_pred, tf.float32))
print(f"\nTest Accuracy: {test_acc.numpy():.3f}")

Epoch 001, Loss: 41.3720, Val Loss: 36.8386, Accuracy: 0.550, Val Acc: 0.375
Epoch 002, Loss: 21.8496, Val Loss: 11.9242, Accuracy: 0.550, Val Acc: 0.438
Epoch 003, Loss: 7.5692, Val Loss: 3.9683, Accuracy: 0.486, Val Acc: 0.625
Epoch 004, Loss: 7.0586, Val Loss: 2.4231, Accuracy: 0.521, Val Acc: 0.750
Epoch 005, Loss: 3.1090, Val Loss: 1.1669, Accuracy: 0.621, Val Acc: 0.750
Epoch 006, Loss: 2.6151, Val Loss: 0.5328, Accuracy: 0.736, Val Acc: 0.750
Epoch 007, Loss: 1.2863, Val Loss: 0.4620, Accuracy: 0.764, Val Acc: 0.938
Epoch 008, Loss: 1.4483, Val Loss: 0.4094, Accuracy: 0.814, Val Acc: 0.938
Epoch 009, Loss: 1.3068, Val Loss: 0.0442, Accuracy: 0.821, Val Acc: 1.000
Epoch 010, Loss: 1.1049, Val Loss: 0.0091, Accuracy: 0.821, Val Acc: 1.000
Epoch 011, Loss: 0.9076, Val Loss: 0.0232, Accuracy: 0.886, Val Acc: 1.000
Epoch 012, Loss: 0.8890, Val Loss: 0.0103, Accuracy: 0.871, Val Acc: 1.000
Epoch 013, Loss: 0.8337, Val Loss: 0.0017, Accuracy: 0.871, Val Acc: 1.000
Epoch 014, Loss: 0.75

**3. Create a model of Iris using all three types of objective variables**

In [20]:
"""
Multi-class classification of Iris dataset using a neural network implemented in TensorFlow 2.x.
This model is configured to classify all three species of Iris.
"""
import numpy as np
import pandas as pd
import tensorflow as tf
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder

# 1. Data Preparation
# Load the dataset
df = pd.read_csv("Iris.csv")
y = df["Species"]
X = df.loc[:, ["SepalLengthCm", "SepalWidthCm", "PetalLengthCm", "PetalWidthCm"]]
X = np.array(X, dtype=np.float32) # Ensure X is float32
y = np.array(y).reshape(-1, 1) # Reshape for one-hot encoding

# One-hot encode the labels for multi-class classification
ohe = OneHotEncoder(handle_unknown='ignore', sparse_output=False)
y = ohe.fit_transform(y)
y = y.astype(np.float32)

# Split into train, val, and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.2, random_state=0)

# Define a simple mini-batch iterator
class GetMiniBatch:
    def __init__(self, X, y, batch_size=10, seed=0):
        self.batch_size = batch_size
        np.random.seed(seed)
        shuffle_index = np.random.permutation(np.arange(X.shape[0]))
        self.X = X[shuffle_index]
        self.y = y[shuffle_index]
        self._stop = np.ceil(X.shape[0] / self.batch_size).astype(np.int64)

    def __len__(self):
        return self._stop

    def __getitem__(self, item):
        p0 = item * self.batch_size
        p1 = item * self.batch_size + self.batch_size
        return self.X[p0:p1], self.y[p0:p1]

    def __iter__(self):
        self._counter = 0
        return self

    def __next__(self):
        if self._counter >= self._stop:
            raise StopIteration()
        p0 = self._counter * self.batch_size
        p1 = self._counter * self.batch_size + self.batch_size
        self._counter += 1
        return self.X[p0:p1], self.y[p0:p1]

# 2. Hyperparameter and Model Setup
learning_rate = 0.001
batch_size = 10
num_epochs = 100
n_hidden1 = 50
n_hidden2 = 100
n_input = X_train.shape[1]
n_classes = 3 # NOW 3 CLASSES FOR THE 3 SPECIES

# Define weights and biases as tf.Variable for TensorFlow 2.x
tf.random.set_seed(0)
weights = {
    'w1': tf.Variable(tf.random.normal([n_input, n_hidden1], dtype=tf.float32)),
    'w2': tf.Variable(tf.random.normal([n_hidden1, n_hidden2], dtype=tf.float32)),
    'w3': tf.Variable(tf.random.normal([n_hidden2, n_classes], dtype=tf.float32))
}
biases = {
    'b1': tf.Variable(tf.random.normal([n_hidden1], dtype=tf.float32)),
    'b2': tf.Variable(tf.random.normal([n_hidden2], dtype=tf.float32)),
    'b3': tf.Variable(tf.random.normal([n_classes], dtype=tf.float32))
}

# Define the Adam optimizer
optimizer = tf.optimizers.Adam(learning_rate=learning_rate)

# 3. Model Architecture (Forward Pass)
def example_net(x):
    """Simple 3-layer neural network for multi-class classification."""
    layer_1 = tf.add(tf.matmul(x, weights['w1']), biases['b1'])
    layer_1 = tf.nn.relu(layer_1)
    layer_2 = tf.add(tf.matmul(layer_1, weights['w2']), biases['b2'])
    layer_2 = tf.nn.relu(layer_2)
    layer_output = tf.matmul(layer_2, weights['w3']) + biases['b3']
    return layer_output

# 4. Training and Evaluation Loop
get_mini_batch_train = GetMiniBatch(X_train, y_train, batch_size=batch_size)
for epoch in range(num_epochs):
    total_loss = 0
    total_acc = 0
    for mini_batch_x, mini_batch_y in get_mini_batch_train:
        # Use tf.GradientTape to record operations for automatic differentiation
        with tf.GradientTape() as tape:
            logits = example_net(mini_batch_x)
            # Use softmax cross-entropy for multi-class classification
            loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=mini_batch_y, logits=logits))

        # Calculate gradients and apply them
        gradients = tape.gradient(loss, [weights['w1'], weights['w2'], weights['w3'],
                                         biases['b1'], biases['b2'], biases['b3']])
        optimizer.apply_gradients(zip(gradients, [weights['w1'], weights['w2'], weights['w3'],
                                                  biases['b1'], biases['b2'], biases['b3']]))

        # Calculate accuracy for the mini-batch
        correct_pred = tf.equal(tf.argmax(logits, 1), tf.argmax(mini_batch_y, 1))
        acc = tf.reduce_mean(tf.cast(correct_pred, tf.float32))

        total_loss += loss.numpy()
        total_acc += acc.numpy()

    # Calculate average loss and accuracy for the epoch
    total_batch = len(get_mini_batch_train)
    total_loss /= total_batch
    total_acc /= total_batch

    # Validation
    val_logits = example_net(X_val)
    val_loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y_val, logits=val_logits))
    val_correct_pred = tf.equal(tf.argmax(val_logits, 1), tf.argmax(y_val, 1))
    val_acc = tf.reduce_mean(tf.cast(val_correct_pred, tf.float32))

    print(f"Epoch {epoch+1:03d}, Loss: {total_loss:.4f}, Val Loss: {val_loss.numpy():.4f}, Accuracy: {total_acc:.3f}, Val Acc: {val_acc.numpy():.3f}")

# Test
test_logits = example_net(X_test)
test_correct_pred = tf.equal(tf.argmax(test_logits, 1), tf.argmax(y_test, 1))
test_acc = tf.reduce_mean(tf.cast(test_correct_pred, tf.float32))
print(f"\nTest Accuracy: {test_acc.numpy():.3f}")

Epoch 001, Loss: 76.5162, Val Loss: 54.4866, Accuracy: 0.000, Val Acc: 0.000
Epoch 002, Loss: 33.5534, Val Loss: 29.0514, Accuracy: 0.133, Val Acc: 0.333
Epoch 003, Loss: 15.0872, Val Loss: 19.6846, Accuracy: 0.497, Val Acc: 0.375
Epoch 004, Loss: 4.0110, Val Loss: 10.8383, Accuracy: 0.687, Val Acc: 0.750
Epoch 005, Loss: 1.1285, Val Loss: 5.0824, Accuracy: 0.890, Val Acc: 0.750
Epoch 006, Loss: 1.5111, Val Loss: 4.7168, Accuracy: 0.930, Val Acc: 0.750
Epoch 007, Loss: 1.2311, Val Loss: 6.0103, Accuracy: 0.950, Val Acc: 0.792
Epoch 008, Loss: 1.0240, Val Loss: 7.0371, Accuracy: 0.930, Val Acc: 0.792
Epoch 009, Loss: 0.8997, Val Loss: 5.9861, Accuracy: 0.940, Val Acc: 0.792
Epoch 010, Loss: 0.8538, Val Loss: 5.7975, Accuracy: 0.950, Val Acc: 0.833
Epoch 011, Loss: 0.7821, Val Loss: 5.9163, Accuracy: 0.960, Val Acc: 0.833
Epoch 012, Loss: 0.7058, Val Loss: 5.5962, Accuracy: 0.960, Val Acc: 0.833
Epoch 013, Loss: 0.6415, Val Loss: 5.3032, Accuracy: 0.960, Val Acc: 0.833
Epoch 014, Loss: 0

**4. Create a model of House Prices**

In [21]:
"""
Regression model for House Prices dataset using a neural network implemented in TensorFlow 2.x.
This model predicts the 'SalePrice' based on 'GrLivArea' and 'YearBuilt'.
"""
import numpy as np
import pandas as pd
import tensorflow as tf
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# 1. Data Preparation
# Load the dataset (assuming 'House_Prices.csv' is available)
df = pd.read_csv("House_Prices.csv")

# Select features (X) and label (y)
X = df[['GrLivArea', 'YearBuilt']]
y = df['SalePrice']

# Convert to NumPy arrays for easier processing
X = np.array(X, dtype=np.float32)
y = np.array(y, dtype=np.float32).reshape(-1, 1)

# Standardize the data. This is crucial for regression models.
scaler_X = StandardScaler()
X = scaler_X.fit_transform(X)
scaler_y = StandardScaler()
y = scaler_y.fit_transform(y)

# Split into train, val, and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.2, random_state=0)

# Define a simple mini-batch iterator
class GetMiniBatch:
    def __init__(self, X, y, batch_size=10, seed=0):
        self.batch_size = batch_size
        np.random.seed(seed)
        shuffle_index = np.random.permutation(np.arange(X.shape[0]))
        self.X = X[shuffle_index]
        self.y = y[shuffle_index]
        self._stop = np.ceil(X.shape[0] / self.batch_size).astype(np.int64)

    def __len__(self):
        return self._stop

    def __getitem__(self, item):
        p0 = item * self.batch_size
        p1 = item * self.batch_size + self.batch_size
        return self.X[p0:p1], self.y[p0:p1]

    def __iter__(self):
        self._counter = 0
        return self

    def __next__(self):
        if self._counter >= self._stop:
            raise StopIteration()
        p0 = self._counter * self.batch_size
        p1 = self._counter * self.batch_size + self.batch_size
        self._counter += 1
        return self.X[p0:p1], self.y[p0:p1]

# 2. Hyperparameter and Model Setup
learning_rate = 0.001
batch_size = 10
num_epochs = 100
n_hidden1 = 50
n_hidden2 = 100
n_input = X_train.shape[1]
n_output = 1  # For regression, the output is a single continuous value

# Define weights and biases as tf.Variable for TensorFlow 2.x
tf.random.set_seed(0)
weights = {
    'w1': tf.Variable(tf.random.normal([n_input, n_hidden1], dtype=tf.float32)),
    'w2': tf.Variable(tf.random.normal([n_hidden1, n_hidden2], dtype=tf.float32)),
    'w3': tf.Variable(tf.random.normal([n_hidden2, n_output], dtype=tf.float32))
}
biases = {
    'b1': tf.Variable(tf.random.normal([n_hidden1], dtype=tf.float32)),
    'b2': tf.Variable(tf.random.normal([n_hidden2], dtype=tf.float32)),
    'b3': tf.Variable(tf.random.normal([n_output], dtype=tf.float32))
}

# Define the Adam optimizer and Mean Squared Error loss
optimizer = tf.optimizers.Adam(learning_rate=learning_rate)
mse_loss = tf.keras.losses.MeanSquaredError()

# 3. Model Architecture (Forward Pass)
def example_net(x):
    """Simple 3-layer neural network for regression."""
    layer_1 = tf.add(tf.matmul(x, weights['w1']), biases['b1'])
    layer_1 = tf.nn.relu(layer_1)
    layer_2 = tf.add(tf.matmul(layer_1, weights['w2']), biases['b2'])
    layer_2 = tf.nn.relu(layer_2)
    # The final layer produces a single output without an activation function
    layer_output = tf.matmul(layer_2, weights['w3']) + biases['b3']
    return layer_output

# 4. Training and Evaluation Loop
get_mini_batch_train = GetMiniBatch(X_train, y_train, batch_size=batch_size)
for epoch in range(num_epochs):
    total_loss = 0
    for mini_batch_x, mini_batch_y in get_mini_batch_train:
        # Use tf.GradientTape to record operations for automatic differentiation
        with tf.GradientTape() as tape:
            y_pred = example_net(mini_batch_x)
            loss = mse_loss(mini_batch_y, y_pred)

        # Calculate gradients and apply them
        gradients = tape.gradient(loss, [weights['w1'], weights['w2'], weights['w3'],
                                         biases['b1'], biases['b2'], biases['b3']])
        optimizer.apply_gradients(zip(gradients, [weights['w1'], weights['w2'], weights['w3'],
                                                  biases['b1'], biases['b2'], biases['b3']]))

        total_loss += loss.numpy()

    # Calculate average loss for the epoch
    total_batch = len(get_mini_batch_train)
    total_loss /= total_batch

    # Validation
    val_pred = example_net(X_val)
    val_loss = mse_loss(y_val, val_pred)

    print(f"Epoch {epoch+1:03d}, Loss: {total_loss:.4f}, Val Loss: {val_loss.numpy():.4f}")

# Test
test_pred = example_net(X_test)
test_loss = mse_loss(y_test, test_pred)
print(f"\nTest MSE: {test_loss.numpy():.4f}")

Epoch 001, Loss: 404.3761, Val Loss: 65.7382
Epoch 002, Loss: 48.3589, Val Loss: 32.1602
Epoch 003, Loss: 26.6619, Val Loss: 19.8194
Epoch 004, Loss: 18.4097, Val Loss: 13.3238
Epoch 005, Loss: 13.2609, Val Loss: 10.0146
Epoch 006, Loss: 9.9662, Val Loss: 7.9801
Epoch 007, Loss: 7.7832, Val Loss: 6.5586
Epoch 008, Loss: 6.2552, Val Loss: 5.5688
Epoch 009, Loss: 5.1503, Val Loss: 4.8929
Epoch 010, Loss: 4.3165, Val Loss: 4.4184
Epoch 011, Loss: 3.6568, Val Loss: 4.0572
Epoch 012, Loss: 3.1292, Val Loss: 3.7596
Epoch 013, Loss: 2.6997, Val Loss: 3.4877
Epoch 014, Loss: 2.3475, Val Loss: 3.2716
Epoch 015, Loss: 2.0751, Val Loss: 3.0659
Epoch 016, Loss: 1.8461, Val Loss: 2.8675
Epoch 017, Loss: 1.6545, Val Loss: 2.6809
Epoch 018, Loss: 1.4981, Val Loss: 2.5095
Epoch 019, Loss: 1.3736, Val Loss: 2.3540
Epoch 020, Loss: 1.2655, Val Loss: 2.2241
Epoch 021, Loss: 1.1730, Val Loss: 2.1083
Epoch 022, Loss: 1.0922, Val Loss: 2.0069
Epoch 023, Loss: 1.0214, Val Loss: 1.9107
Epoch 024, Loss: 0.9617

**Adapting the Model for Regression**

Switching from a classification problem (like the Iris dataset) to a regression problem (like predicting house prices) requires fundamental changes to the neural network's architecture and training process. The goal is no longer to assign a category but to predict a continuous, numerical value.

**Here are the key differences I've implemented in the updated code:**

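**Data Preparation:** I've modified the data loading to use GrLivArea and YearBuilt as explanatory variables and SalePrice as the objective variable. To keep the gradients well scaled during training, both the features and the target SalePrice are standardized using a StandardScaler. Note that the scalers in the code above are fit on the full dataset before the split, so test-set statistics leak into the preprocessing; fitting them on the training split alone would avoid that.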

**Output Layer:** The final layer of the neural network has been changed from 3 neurons to 1 neuron. A regression model only needs a single output neuron to predict the single continuous value.

**Loss Function:** The multi-class softmax_cross_entropy_with_logits has been replaced with mean_squared_error (tf.keras.losses.MeanSquaredError). This is the most common loss function for regression, as it measures the average squared difference between the predicted and actual values.

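**Evaluation:** The accuracy metric has been removed, as it is only relevant for classification. Instead, the training and validation progress are now tracked using the Mean Squared Error (MSE) loss itself. Because SalePrice is standardized before training, the reported MSE is in standardized units rather than squared price units.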
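
To report an error in the original price units, the standardized predictions can be mapped back before computing RMSE. The sketch below assumes synthetic prices and fake predictions, and standardizes by hand with training-split statistics; with scikit-learn the same step would use the scaler's `inverse_transform`.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for SalePrice (an assumption, not the real dataset).
y = rng.normal(loc=180_000, scale=80_000, size=(500, 1))
y_train, y_test = y[:400], y[400:]

# Fit the standardization on the training split only.
mean, std = y_train.mean(), y_train.std()
y_test_s = (y_test - mean) / std

# Pretend model output in standardized space: the truth plus some noise.
pred_s = y_test_s + rng.normal(scale=0.1, size=y_test_s.shape)

# MSE as tracked during training (standardized units).
mse_standardized = np.mean((pred_s - y_test_s) ** 2)

# Map predictions back to price units and compute RMSE there.
pred = pred_s * std + mean
rmse_price = np.sqrt(np.mean((pred - y_test) ** 2))

print(f"standardized MSE: {mse_standardized:.4f}")
print(f"RMSE in price units: {rmse_price:.0f}")
```

The two numbers are related by `rmse_price = std * sqrt(mse_standardized)`, which is a quick way to interpret the standardized loss curve.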



**5. Create a model of MNIST**

In [23]:
"""
Multi-class classification of the MNIST dataset using a simple neural network implemented in TensorFlow 2.x.
This model is configured to classify all 10 digits (0-9).
"""
import numpy as np
import pandas as pd
import tensorflow as tf
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# 0. GPU Configuration Check
# tf.config.list_physical_devices() is the TensorFlow 2.x way to check for a GPU.
# tf.test.gpu_device_name() dates from TensorFlow 1.x, so it is not used here.
gpus = tf.config.list_physical_devices('GPU')
print("Num GPUs Available: ", len(gpus))
if gpus:
    print('Default GPU Device: {}'.format(gpus[0].name))
else:
    print("No GPU detected; training will run on the CPU.")
    print("Note: '!pip install tensorflow-gpu==1.14.0' is for TensorFlow 1.x.")
    print("This code is written for TensorFlow 2.x, and the GPU is automatically utilized if available.")

# 1. Data Preparation
# Load the MNIST dataset directly from TensorFlow
(X_train_raw, y_train_raw), (X_test_raw, y_test_raw) = tf.keras.datasets.mnist.load_data()

# Normalize pixel values to be between 0 and 1
X_train = X_train_raw.astype(np.float32) / 255.0
X_test = X_test_raw.astype(np.float32) / 255.0

# Flatten the 28x28 images into a 784-dimensional vector
X_train = X_train.reshape(-1, 28 * 28)
X_test = X_test.reshape(-1, 28 * 28)

# One-hot encode the labels. For example, the label '5' becomes a vector [0,0,0,0,0,1,0,0,0,0]
y_train = tf.keras.utils.to_categorical(y_train_raw, 10)
y_test = tf.keras.utils.to_categorical(y_test_raw, 10)

# Split the training data into training and validation sets
X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.2, random_state=0)

# Define a simple mini-batch iterator
class GetMiniBatch:
    def __init__(self, X, y, batch_size=10, seed=0):
        self.batch_size = batch_size
        np.random.seed(seed)
        shuffle_index = np.random.permutation(np.arange(X.shape[0]))
        self.X = X[shuffle_index]
        self.y = y[shuffle_index]
        self._stop = np.ceil(X.shape[0] / self.batch_size).astype(np.int64)

    def __len__(self):
        return self._stop

    def __getitem__(self, item):
        p0 = item * self.batch_size
        p1 = item * self.batch_size + self.batch_size
        return self.X[p0:p1], self.y[p0:p1]

    def __iter__(self):
        self._counter = 0
        return self

    def __next__(self):
        if self._counter >= self._stop:
            raise StopIteration()
        p0 = self._counter * self.batch_size
        p1 = self._counter * self.batch_size + self.batch_size
        self._counter += 1
        return self.X[p0:p1], self.y[p0:p1]

# 2. Hyperparameter and Model Setup
learning_rate = 0.001
batch_size = 128
num_epochs = 20
n_hidden1 = 128
n_hidden2 = 64
n_input = 28 * 28 # Flattened image size
n_classes = 10    # 10 digits from 0 to 9

# Define weights and biases as tf.Variable for TensorFlow 2.x
tf.random.set_seed(0)
weights = {
    'w1': tf.Variable(tf.random.normal([n_input, n_hidden1], dtype=tf.float32)),
    'w2': tf.Variable(tf.random.normal([n_hidden1, n_hidden2], dtype=tf.float32)),
    'w3': tf.Variable(tf.random.normal([n_hidden2, n_classes], dtype=tf.float32))
}
biases = {
    'b1': tf.Variable(tf.random.normal([n_hidden1], dtype=tf.float32)),
    'b2': tf.Variable(tf.random.normal([n_hidden2], dtype=tf.float32)),
    'b3': tf.Variable(tf.random.normal([n_classes], dtype=tf.float32))
}

# Define the Adam optimizer and Categorical Crossentropy loss
optimizer = tf.optimizers.Adam(learning_rate=learning_rate)
loss_fn = tf.keras.losses.CategoricalCrossentropy(from_logits=True)

# 3. Model Architecture (Forward Pass)
def mnist_net(x):
    """Simple 3-layer neural network for MNIST classification."""
    layer_1 = tf.add(tf.matmul(x, weights['w1']), biases['b1'])
    layer_1 = tf.nn.relu(layer_1)
    layer_2 = tf.add(tf.matmul(layer_1, weights['w2']), biases['b2'])
    layer_2 = tf.nn.relu(layer_2)
    layer_output = tf.matmul(layer_2, weights['w3']) + biases['b3']
    return layer_output

# 4. Training and Evaluation Loop
get_mini_batch_train = GetMiniBatch(X_train, y_train, batch_size=batch_size)
for epoch in range(num_epochs):
    total_loss = 0
    total_acc = 0
    for mini_batch_x, mini_batch_y in get_mini_batch_train:
        # Use tf.GradientTape to record operations for automatic differentiation
        with tf.GradientTape() as tape:
            logits = mnist_net(mini_batch_x)
            loss = loss_fn(mini_batch_y, logits)

        # Calculate gradients and apply them (order: w1, w2, w3, b1, b2, b3)
        trainable_vars = list(weights.values()) + list(biases.values())
        gradients = tape.gradient(loss, trainable_vars)
        optimizer.apply_gradients(zip(gradients, trainable_vars))

        # Calculate accuracy for the mini-batch
        correct_pred = tf.equal(tf.argmax(logits, 1), tf.argmax(mini_batch_y, 1))
        acc = tf.reduce_mean(tf.cast(correct_pred, tf.float32))

        total_loss += loss.numpy()
        total_acc += acc.numpy()

    # Calculate average loss and accuracy for the epoch
    total_batch = len(get_mini_batch_train)
    total_loss /= total_batch
    total_acc /= total_batch

    # Validation
    val_logits = mnist_net(X_val)
    val_loss = loss_fn(y_val, val_logits)
    val_correct_pred = tf.equal(tf.argmax(val_logits, 1), tf.argmax(y_val, 1))
    val_acc = tf.reduce_mean(tf.cast(val_correct_pred, tf.float32))

    print(f"Epoch {epoch+1:03d}, Loss: {total_loss:.4f}, Val Loss: {val_loss.numpy():.4f}, Accuracy: {total_acc:.3f}, Val Acc: {val_acc.numpy():.3f}")

# Test
test_logits = mnist_net(X_test)
test_loss = loss_fn(y_test, test_logits)
test_correct_pred = tf.equal(tf.argmax(test_logits, 1), tf.argmax(y_test, 1))
test_acc = tf.reduce_mean(tf.cast(test_correct_pred, tf.float32))
print(f"\nTest Loss: {test_loss.numpy():.4f}, Test Accuracy: {test_acc.numpy():.3f}")
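
The cell above has mnist_net return raw scores (logits) and passes from_logits=True to the loss, so the softmax is applied inside the loss rather than in the network. As a rough sketch of that computation, the NumPy function below evaluates softmax cross-entropy directly from logits using the usual max-subtraction trick for numerical stability. It illustrates the arithmetic only and is not TensorFlow's implementation; the function name is hypothetical.

```python
import numpy as np

def softmax_cross_entropy_from_logits(y_onehot, logits):
    """Mean cross-entropy over a batch, computed from raw logits."""
    logits = np.asarray(logits, dtype=np.float64)
    y_onehot = np.asarray(y_onehot, dtype=np.float64)
    # Subtract the row-wise max before exponentiating so exp() cannot overflow
    shifted = logits - logits.max(axis=1, keepdims=True)
    # log-softmax: shifted logits minus the log of the normalizing sum
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Pick out the log-probability of the true class and average over the batch
    return -np.mean(np.sum(y_onehot * log_probs, axis=1))

# Two one-hot labels (digits 3 and 7) with identical logits for every class
y = np.eye(10)[[3, 7]]
uniform_loss = softmax_cross_entropy_from_logits(y, np.zeros((2, 10)))
```

With identical logits for all 10 classes the loss equals ln(10), roughly 2.30, which is what an uninformed guess costs. The Epoch 001 training loss of 112.0244 in the output is far above that level, which suggests the initial logits are large in magnitude. That would be consistent with the unit-variance tf.random.normal initialization, and a smaller initial standard deviation might be worth trying.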

Num GPUs Available:  0
Please install GPU version of TF
Note: '!pip install tensorflow-gpu==1.14.0' is for TensorFlow 1.x.
This code is written for TensorFlow 2.x, and the GPU is automatically utilized if available.
Epoch 001, Loss: 112.0244, Val Loss: 33.4330, Accuracy: 0.533, Val Acc: 0.743
Epoch 002, Loss: 25.1438, Val Loss: 19.7584, Accuracy: 0.790, Val Acc: 0.817
Epoch 003, Loss: 16.2872, Val Loss: 14.3949, Accuracy: 0.838, Val Acc: 0.848
Epoch 004, Loss: 11.9886, Val Loss: 11.4416, Accuracy: 0.865, Val Acc: 0.864
Epoch 005, Loss: 9.2795, Val Loss: 9.4317, Accuracy: 0.881, Val Acc: 0.877
Epoch 006, Loss: 7.4390, Val Loss: 8.1914, Accuracy: 0.893, Val Acc: 0.884
Epoch 007, Loss: 6.1069, Val Loss: 7.2564, Accuracy: 0.904, Val Acc: 0.889
Epoch 008, Loss: 5.0883, Val Loss: 6.5296, Accuracy: 0.912, Val Acc: 0.895
Epoch 009, Loss: 4.2969, Val Loss: 5.9366, Accuracy: 0.920, Val Acc: 0.900
Epoch 010, Loss: 3.6512, Val Loss: 5.4842, Accuracy: 0.926, Val Acc: 0.904
Epoch 011, Loss: 3.1175, 