<a href="https://colab.research.google.com/github/IvanOM-97/DPro-Exercises/blob/master/U44T2C96TensorflowSeriesAssignments.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
# @title
'''
TENSORFLOW

'''
!pip install TensorFlow



In [None]:
'''
PROBLEM 1 - LOOKING BACK AT THE SCRATCH IMPLEMENTATIONS
  Looking back at the from-scratch neural networks built so far, I had to implement the following by hand:

  1. Weight initialization
  2. Forward propagation
  3. Loss calculation (mean squared error and cross-entropy)
  4. Backpropagation
  5. Gradient-based parameter updates
  6. The epoch loop
  7. Mini-batch processing
  8. Accuracy evaluation
  9. Data preprocessing: loading, splitting, and normalizing the data
'''
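
As a reminder of what those steps look like by hand, here is a minimal NumPy sketch of a single-layer logistic classifier. It is not the earlier assignment code: the toy data, the learning rate, and names such as `forward` are assumptions made for illustration, and step 9 (preprocessing) is replaced by synthetic data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: 2 features, label is 1 when their sum is positive.
X = rng.normal(size=(200, 2))
y = (X.sum(axis=1, keepdims=True) > 0).astype(np.float64)

# 1. Weight initialization
W = rng.normal(scale=0.01, size=(2, 1))
b = np.zeros(1)

def forward(X, W, b):
    # 2. Forward propagation: affine transform followed by a sigmoid
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))

lr, batch_size, num_epochs = 0.5, 20, 30
for epoch in range(num_epochs):                  # 6. Epoch loop
    order = rng.permutation(len(X))              # reshuffle every epoch
    for start in range(0, len(X), batch_size):   # 7. Mini-batch processing
        idx = order[start:start + batch_size]
        xb, yb = X[idx], y[idx]
        p = forward(xb, W, b)
        # 3. Loss: binary cross-entropy (probabilities clipped to avoid log(0))
        p_safe = np.clip(p, 1e-12, 1.0 - 1e-12)
        loss = -np.mean(yb * np.log(p_safe) + (1 - yb) * np.log(1 - p_safe))
        # 4. Backpropagation: for sigmoid + cross-entropy, dL/dz = (p - y) / batch
        dz = (p - yb) / len(xb)
        dW, db = xb.T @ dz, dz.sum(axis=0)
        # 5. Gradient-descent parameter update
        W -= lr * dW
        b -= lr * db

# 8. Accuracy evaluation on the training data
accuracy = np.mean((forward(X, W, b) > 0.5) == (y > 0.5))
print(f"last batch loss: {loss:.4f}, accuracy: {accuracy:.3f}")
```

Every numbered comment corresponds to one item in the list above, which makes it easier to see which parts a framework takes over.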

In [1]:
'''
PROBLEM 2 - CONSIDERING THE CORRESPONDENCE BETWEEN SCRATCH AND TENSORFLOW
  This is how the sample code below implements each of those steps in TensorFlow:

  1. Weight initialization - tf.Variable(tf.random_normal(...))
  2. Forward propagation - tf.matmul, tf.add, tf.nn.relu
  3. Loss calculation - tf.nn.sigmoid_cross_entropy_with_logits, averaged with tf.reduce_mean
  4. Backpropagation - handled inside optimizer.minimize(loss_op), which computes the gradients
  5. Parameter update - running train_op = optimizer.minimize(loss_op) applies the gradient updates
  6. Epoch loop - for epoch in range(num_epochs)
  7. Mini-batch handling - the GetMiniBatch class, iterated with for ... in get_mini_batch_train
  8. Accuracy calculation - tf.equal(...), tf.cast(...), tf.reduce_mean(...)
  9. Data preprocessing - pandas, NumPy, and scikit-learn
'''

# Sample Code
"""
Using a neural network implemented in TensorFlow to perform binary classification on the Iris dataset
"""
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
import tensorflow.compat.v1 as tf
#import tensorflow as tf
tf.disable_v2_behavior()

tf.test.gpu_device_name()
"""
When switching to the TensorFlow 1.x series, remember to install the GPU build with "!pip install tensorflow-gpu==1.14.0".
Use tf.test.gpu_device_name() to check whether a GPU is recognized.
It returns the device name (for example "/device:GPU:0") when a GPU is found and an empty string otherwise.
"""

# Loading the dataset
df = pd.read_csv("Iris.csv")

# Conditional extraction from DataFrame
df = df[(df["Species"] == "Iris-versicolor") | (df["Species"] == "Iris-virginica")]
y = df["Species"]
X = df.loc[:, ["SepalLengthCm", "SepalWidthCm", "PetalLengthCm", "PetalWidthCm"]]

# Convert to NumPy array
X = np.array(X)
y = np.array(y)
# Convert labels to numerical values
y[y == "Iris-versicolor"] = 0
y[y == "Iris-virginica"] = 1
y = y.astype(np.int64)[:, np.newaxis]

# Split into train and test
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
# Further split train into train and validation
X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.2, random_state=0)

class GetMiniBatch:
    """
    Iterator to get mini-batches
      Parameters
      ----------
      X : ndarray of shape (n_samples, n_features)
        Training data
      y : ndarray of shape (n_samples, 1)
        Ground truth values
      batch_size : int
        Batch size
      seed : int
        NumPy random seed
    """
    def __init__(self, X, y, batch_size = 10, seed=0):
        self.batch_size = batch_size
        np.random.seed(seed)
        shuffle_index = np.random.permutation(np.arange(X.shape[0]))
        self.X = X[shuffle_index]
        self.y = y[shuffle_index]
        self._stop = np.ceil(X.shape[0]/self.batch_size).astype(int)
    def __len__(self):
        return self._stop
    def __getitem__(self,item):
        p0 = item*self.batch_size
        p1 = item*self.batch_size + self.batch_size
        return self.X[p0:p1], self.y[p0:p1]
    def __iter__(self):
        self._counter = 0
        return self
    def __next__(self):
        if self._counter >= self._stop:
            raise StopIteration()
        p0 = self._counter*self.batch_size
        p1 = self._counter*self.batch_size + self.batch_size
        self._counter += 1
        return self.X[p0:p1], self.y[p0:p1]

# Hyperparameter configuration
learning_rate = 0.001
batch_size = 10
num_epochs = 100

n_hidden1 = 50
n_hidden2 = 100
n_input = X_train.shape[1]
n_samples = X_train.shape[0]
n_classes = 1

# Define the shape of arguments to pass to the computational graph
X = tf.placeholder("float", [None, n_input])
Y = tf.placeholder("float", [None, n_classes])

# Mini-batch iterator for training
get_mini_batch_train = GetMiniBatch(X_train, y_train, batch_size=batch_size)

def example_net(x):
    """
    A simple three-layer neural network
    """
    tf.random.set_random_seed(0)
    # Declaration of weights and biases
    weights = {
        'w1': tf.Variable(tf.random_normal([n_input, n_hidden1])),
        'w2': tf.Variable(tf.random_normal([n_hidden1, n_hidden2])),
        'w3': tf.Variable(tf.random_normal([n_hidden2, n_classes]))
    }
    biases = {
        'b1': tf.Variable(tf.random_normal([n_hidden1])),
        'b2': tf.Variable(tf.random_normal([n_hidden2])),
        'b3': tf.Variable(tf.random_normal([n_classes]))
    }

    layer_1 = tf.add(tf.matmul(x, weights['w1']), biases['b1'])
    layer_1 = tf.nn.relu(layer_1)
    layer_2 = tf.add(tf.matmul(layer_1, weights['w2']), biases['b2'])
    layer_2 = tf.nn.relu(layer_2)
    layer_output = tf.matmul(layer_2, weights['w3']) + biases['b3']  # tf.add and + are equivalent
    return layer_output

# Loading the network structure
logits = example_net(X)

# Objective function
loss_op = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(labels=Y, logits=logits))
# Optimization method
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate)
train_op = optimizer.minimize(loss_op)

# Prediction results
correct_pred = tf.equal(tf.sign(Y - 0.5), tf.sign(tf.sigmoid(logits) - 0.5))
# Metric calculation
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))

# Variable initialization
init = tf.global_variables_initializer()


# Execution of the computational graph
with tf.Session() as sess:
    sess.run(init)
    for epoch in range(num_epochs):
        # Loop for each epoch
        total_batch = np.ceil(X_train.shape[0]/batch_size).astype(np.int64)
        total_loss = 0
        total_acc = 0
        for i, (mini_batch_x, mini_batch_y) in enumerate(get_mini_batch_train):
            # Loop for each mini-batch
            sess.run(train_op, feed_dict={X: mini_batch_x, Y: mini_batch_y})
            loss, acc = sess.run([loss_op, accuracy], feed_dict={X: mini_batch_x, Y: mini_batch_y})
            total_loss += loss
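        # NOTE: total_loss is a sum of per-batch mean losses, so dividing by n_samples
        # (instead of total_batch) makes the printed training loss roughly batch_size
        # times smaller than the true mean. This explains the gap to val_loss in the log.
        total_loss /= n_samples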
        val_loss, acc = sess.run([loss_op, accuracy], feed_dict={X: X_val, Y: y_val})
        print("Epoch {}, loss : {:.4f}, val_loss : {:.4f}, acc : {:.3f}".format(epoch, total_loss, val_loss, acc))
    test_acc = sess.run(accuracy, feed_dict={X: X_test, Y: y_test})
    print("test_acc : {:.3f}".format(test_acc))

Instructions for updating:
non-resource variables are not supported in the long term


Epoch 0, loss : 7.0241, val_loss : 67.6860, acc : 0.375
Epoch 1, loss : 3.4241, val_loss : 23.4026, acc : 0.312
Epoch 2, loss : 1.9387, val_loss : 11.6681, acc : 0.375
Epoch 3, loss : 2.0917, val_loss : 13.1400, acc : 0.312
Epoch 4, loss : 1.7685, val_loss : 17.7284, acc : 0.312
Epoch 5, loss : 1.6097, val_loss : 12.9607, acc : 0.312
Epoch 6, loss : 1.4402, val_loss : 10.0593, acc : 0.312
Epoch 7, loss : 1.3704, val_loss : 9.4797, acc : 0.312
Epoch 8, loss : 1.2536, val_loss : 9.8518, acc : 0.312
Epoch 9, loss : 1.1476, val_loss : 8.5670, acc : 0.375
Epoch 10, loss : 1.0930, val_loss : 8.0430, acc : 0.375
Epoch 11, loss : 1.0412, val_loss : 7.8791, acc : 0.375
Epoch 12, loss : 0.9804, val_loss : 7.1233, acc : 0.375
Epoch 13, loss : 0.9326, val_loss : 6.7908, acc : 0.375
Epoch 14, loss : 0.8792, val_loss : 6.2492, acc : 0.375
Epoch 15, loss : 0.8304, val_loss : 5.7680, acc : 0.375
Epoch 16, loss : 0.7835, val_loss : 5.2886, acc : 0.438
Epoch 17, loss : 0.7384, val_loss : 4.8037, acc : 0
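
The loss used in this cell works directly on logits rather than on sigmoid outputs. The sketch below reproduces, in NumPy, the numerically stable formula given in the TensorFlow documentation for `tf.nn.sigmoid_cross_entropy_with_logits`, namely `max(x, 0) - x * z + log(1 + exp(-abs(x)))`, and compares it with the textbook expression. The NumPy function names and sample values are assumptions for illustration; this is not TensorFlow's actual source.

```python
import numpy as np

def sigmoid_cross_entropy_with_logits(labels, logits):
    """Element-wise binary cross-entropy computed from raw logits (stable form)."""
    x = np.asarray(logits, dtype=np.float64)
    z = np.asarray(labels, dtype=np.float64)
    return np.maximum(x, 0) - x * z + np.log1p(np.exp(-np.abs(x)))

def naive_cross_entropy(labels, logits):
    """Textbook form: apply the sigmoid first, then take logs."""
    p = 1.0 / (1.0 + np.exp(-logits))
    return -(labels * np.log(p) + (1 - labels) * np.log(1 - p))

# Illustrative values only
logits = np.array([-2.0, 0.5, 3.0])
labels = np.array([0.0, 1.0, 1.0])

stable = sigmoid_cross_entropy_with_logits(labels, logits)
naive = naive_cross_entropy(labels, logits)
print(stable, naive)

# The stable form stays finite even for extreme logits,
# where the naive form would evaluate log(0).
extreme = sigmoid_cross_entropy_with_logits(np.array([1.0]), np.array([-1000.0]))
print(extreme)

# Thresholding sigmoid(logits) at 0.5 is the same as checking logits > 0,
# which is what the tf.sign(...) comparison in the cell relies on.
predicted = (logits > 0).astype(np.float64)
accuracy = np.mean(predicted == labels)
```

Working from logits avoids overflow in `exp` and `log(0)` when the network output is far from zero, which is likely early in training with unit-variance weight initialization.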

In [None]:
# PROBLEM 3 - CREATE A MODEL FOR IRIS USING ALL 3 TARGET CLASSES (CLASSIFICATION)
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelBinarizer
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

# Loading Iris Dataset
#from google.colab import files
#uploaded = files.upload()

df = pd.read_csv("Iris.csv")
X = df[["SepalLengthCm", "SepalWidthCm", "PetalLengthCm", "PetalWidthCm"]].values
y = df["Species"].values

# One-hot encode labels
encoder = LabelBinarizer()
y = encoder.fit_transform(y)

# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.2, random_state=0)

# Hyperparameters
learning_rate = 0.001
batch_size = 10
num_epochs = 100
n_input = X_train.shape[1]
n_hidden1, n_hidden2 = 50, 100
n_classes = 3
n_samples = X_train.shape[0]

# Placeholders
X_ph = tf.placeholder(tf.float32, [None, n_input])
Y_ph = tf.placeholder(tf.float32, [None, n_classes])

# Weights and biases
weights = {
    'w1' : tf.Variable(tf.random_normal([n_input, n_hidden1])),
    'w2' : tf.Variable(tf.random_normal([n_hidden1, n_hidden2])),
    'w3' : tf.Variable(tf.random_normal([n_hidden2, n_classes]))
}
biases = {
    'b1' : tf.Variable(tf.random_normal([n_hidden1])),
    'b2' : tf.Variable(tf.random_normal([n_hidden2])),
    'b3' : tf.Variable(tf.random_normal([n_classes]))
}

# Model
def model(x):
    l1 = tf.nn.relu(tf.add(tf.matmul(x, weights['w1']), biases['b1']))
    l2 = tf.nn.relu(tf.add(tf.matmul(l1, weights['w2']), biases['b2']))
    return tf.add(tf.matmul(l2, weights['w3']), biases['b3'])

logits = model(X_ph)

# Loss and optimizer
loss_op = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=Y_ph, logits=logits))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(loss_op)

# Accuracy
correct_pred = tf.equal(tf.argmax(logits, 1), tf.argmax(Y_ph, 1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))

init = tf.global_variables_initializer()

# Training
with tf.Session() as sess:
    sess.run(init)
    for epoch in range(num_epochs):
        for i in range(0, n_samples, batch_size):
            x_batch = X_train[i:i+batch_size]
            y_batch = y_train[i:i+batch_size]
            sess.run(optimizer, feed_dict={X_ph: x_batch, Y_ph: y_batch})

        if epoch % 10 == 0:
            val_loss, val_acc = sess.run([loss_op, accuracy], feed_dict={X_ph: X_val, Y_ph: y_val})
            print(f"Epoch {epoch}, Validation Loss: {val_loss:.4f}, Validation Accuracy: {val_acc:.4f}")

    test_acc = sess.run(accuracy, feed_dict={X_ph: X_test, Y_ph: y_test})
    print(f"Test Accuracy: {test_acc:.4f}")

Instructions for updating:

Future major versions of TensorFlow will allow gradients to flow
into the labels input on backprop by default.

See `tf.nn.softmax_cross_entropy_with_logits_v2`.



Epoch 0, Validation Loss: 98.3034, Validation Accuracy: 0.4583
Epoch 10, Validation Loss: 0.6594, Validation Accuracy: 0.8333
Epoch 20, Validation Loss: 0.6042, Validation Accuracy: 0.9167
Epoch 30, Validation Loss: 0.5235, Validation Accuracy: 0.8750
Epoch 40, Validation Loss: 0.4777, Validation Accuracy: 0.9583
Epoch 50, Validation Loss: 0.4499, Validation Accuracy: 0.9167
Epoch 60, Validation Loss: 0.4432, Validation Accuracy: 0.9167
Epoch 70, Validation Loss: 0.4385, Validation Accuracy: 0.8750
Epoch 80, Validation Loss: 0.4333, Validation Accuracy: 0.8750
Epoch 90, Validation Loss: 0.4284, Validation Accuracy: 0.8750
Test Accuracy: 1.0000
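
Moving from two classes to three swaps the sigmoid loss for a softmax loss and the sign comparison for an argmax comparison. The following NumPy sketch shows what those two pieces compute on one-hot labels. The function name mirrors the TensorFlow op for readability only, and the logits are made-up values, not outputs of the model above.

```python
import numpy as np

def softmax_cross_entropy_with_logits(labels, logits):
    """Per-sample cross-entropy between one-hot labels and softmax(logits)."""
    # Subtracting the row maximum leaves softmax unchanged but avoids overflow.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -(labels * log_probs).sum(axis=1)

# Illustrative logits for three samples and three classes
logits = np.array([[2.0, 1.0, 0.1],
                   [0.5, 2.5, 0.3],
                   [1.2, 0.2, 0.4]])
labels = np.array([[1, 0, 0],
                   [0, 1, 0],
                   [0, 0, 1]], dtype=np.float64)

per_sample_loss = softmax_cross_entropy_with_logits(labels, logits)
mean_loss = per_sample_loss.mean()  # counterpart of tf.reduce_mean

# Accuracy: the predicted class is the index of the largest logit.
accuracy = np.mean(logits.argmax(axis=1) == labels.argmax(axis=1))
print(per_sample_loss, mean_loss, accuracy)
```

The third sample is deliberately misclassified (its largest logit is in column 0 while its label is column 2), so its loss is the largest of the three.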


In [None]:
# PROBLEM 4 - CREATE A MODEL FOR HOUSE PRICES (REGRESSION)
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
import tensorflow.compat.v1 as tf
#import tensorflow as tf
tf.disable_v2_behavior()

# Loading the dataset
df = pd.read_csv("train.csv")
df = df[["GrLivArea", "YearBuilt", "SalePrice"]].dropna()
X = df[["GrLivArea", "YearBuilt"]].values
y = df["SalePrice"].values.reshape(-1, 1)

# Normalizing
X = (X - X.mean(axis=0)) / X.std(axis=0)
y = (y - y.mean()) / y.std()

# Split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.2, random_state=0)

# Hyperparameter configuration
batch_size = 10

# Placeholder
X_ph = tf.placeholder(tf.float32, [None, 2])
y_ph = tf.placeholder(tf.float32, [None, 1])

# Weights and biases
W1 = tf.Variable(tf.random_normal([2, 64]))
b1 = tf.Variable(tf.zeros([64]))
W2 = tf.Variable(tf.random_normal([64, 1]))
b2 = tf.Variable(tf.zeros([1]))

# Model
l1 = tf.nn.relu(tf.matmul(X_ph, W1) + b1)
output = tf.matmul(l1, W2) + b2

# Loss and optimizer
loss = tf.reduce_mean(tf.square(output - y_ph))
optimizer = tf.train.AdamOptimizer(0.01).minimize(loss)

# Training
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    for epoch in range(100):
        for i in range(0, len(X_train), batch_size):
            x_batch = X_train[i:i+batch_size]
            y_batch = y_train[i:i+batch_size]
            sess.run(optimizer, feed_dict={X_ph: x_batch, y_ph: y_batch})

        if epoch % 10 == 0:
            val_loss = sess.run(loss, feed_dict={X_ph: X_val, y_ph: y_val})
            print(f"Epoch {epoch}, Validation Loss: {val_loss:.4f}")

    test_loss = sess.run(loss, feed_dict={X_ph: X_test, y_ph: y_test})
    print(f"Test Loss: {test_loss:.4f}")

Epoch 0, Validation Loss: 0.4406
Epoch 10, Validation Loss: 0.3585
Epoch 20, Validation Loss: 0.3512
Epoch 30, Validation Loss: 0.3416
Epoch 40, Validation Loss: 0.3471
Epoch 50, Validation Loss: 0.3272
Epoch 60, Validation Loss: 0.3051
Epoch 70, Validation Loss: 0.2885
Epoch 80, Validation Loss: 0.2828
Epoch 90, Validation Loss: 0.2850
Test Loss: 0.5117
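
Because the target was standardized, the losses above are in units of squared standard deviations, not dollars. The sketch below shows one way to compute the scaling statistics on the training split only and to convert a standardized error back to the original price scale. The prices and "predictions" here are randomly generated placeholders, not values from train.csv or from the trained model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for SalePrice; the real values come from train.csv.
y = rng.uniform(50_000, 500_000, size=(100, 1))
y_train, y_test = y[:80], y[80:]

# Fit the scaling statistics on the training split only,
# then apply the same statistics to the test split.
y_mean, y_std = y_train.mean(), y_train.std()
y_train_s = (y_train - y_mean) / y_std
y_test_s = (y_test - y_mean) / y_std

# Placeholder predictions in standardized units (illustrative noise only)
pred_s = y_test_s + rng.normal(scale=0.1, size=y_test_s.shape)

# Mean squared error in standardized units, as printed by the cell above
mse_s = np.mean((pred_s - y_test_s) ** 2)

# Undo the scaling to report an error in price units
pred = pred_s * y_std + y_mean
rmse = np.sqrt(np.mean((pred - y_test) ** 2))
print(f"standardized MSE: {mse_s:.4f}, RMSE in price units: {rmse:.2f}")
```

Since the transform is linear, the price-scale RMSE equals `sqrt(mse_s) * y_std`, so a standardized loss can be converted without re-running the model.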


In [None]:
# PROBLEM 5 - CREATE A MODEL FOR MNIST (IMAGE CLASSIFICATION)
import numpy as np
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

# Loading MNIST from tf.keras
from tensorflow.keras.datasets import mnist
(X_train, y_train), (X_test, y_test) = mnist.load_data()

# Normalizing and reshaping
x_train = X_train.reshape(-1, 784) / 255.0
x_test = X_test.reshape(-1, 784) / 255.0

# One-hot encode labels
from sklearn.preprocessing import OneHotEncoder
enc = OneHotEncoder(sparse_output=False)
y_train = enc.fit_transform(y_train.reshape(-1, 1))
y_test = enc.transform(y_test.reshape(-1, 1))

# Splitting validation set
x_val = x_train[-5000:]
y_val = y_train[-5000:]
x_train = x_train[:-5000]
y_train = y_train[:-5000]

# Placeholders
X = tf.placeholder(tf.float32, [None, 784])
Y = tf.placeholder(tf.float32, [None, 10])

# Model parameters
W1 = tf.Variable(tf.random_normal([784, 128]))
b1 = tf.Variable(tf.zeros([128]))
W2 = tf.Variable(tf.random_normal([128, 10]))
b2 = tf.Variable(tf.zeros([10]))

# Building model
l1 = tf.nn.relu(tf.matmul(X, W1) + b1)
logits = tf.matmul(l1, W2) + b2

# Loss and optimizer
loss_op = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=Y, logits=logits))
optimizer = tf.train.AdamOptimizer(learning_rate=0.001).minimize(loss_op)

# Accuracy
correct_pred = tf.equal(tf.argmax(logits, 1), tf.argmax(Y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))

# Training
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    for epoch in range(10):
        for i in range(0, len(x_train), 100):
            x_batch = x_train[i:i+100]
            y_batch = y_train[i:i+100]
            sess.run(optimizer, feed_dict={X: x_batch, Y: y_batch})

        val_acc = sess.run(accuracy, feed_dict={X: x_val, Y: y_val})
        print(f"Epoch {epoch}, Validation Accuracy: {val_acc:.4f}")

    test_acc = sess.run(accuracy, feed_dict={X: x_test, Y: y_test})
    print(f"Test Accuracy: {test_acc:.4f}")



Epoch 0, Validation Accuracy: 0.8202
Epoch 1, Validation Accuracy: 0.8754
Epoch 2, Validation Accuracy: 0.8926
Epoch 3, Validation Accuracy: 0.9036
Epoch 4, Validation Accuracy: 0.9112
Epoch 5, Validation Accuracy: 0.9182
Epoch 6, Validation Accuracy: 0.9226
Epoch 7, Validation Accuracy: 0.9276
Epoch 8, Validation Accuracy: 0.9318
Epoch 9, Validation Accuracy: 0.9342
Test Accuracy: 0.9230
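
The one-hot step in this cell uses scikit-learn's `OneHotEncoder`. For integer class labels such as MNIST digits, the same encoding can be sketched in plain NumPy by indexing an identity matrix. The helper name `one_hot` is an assumption for illustration and is not part of the notebook.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Return a (n_samples, num_classes) float32 array with a single 1 per row."""
    labels = np.asarray(labels, dtype=np.int64)
    # Row k of the identity matrix is the one-hot vector for class k.
    return np.eye(num_classes, dtype=np.float32)[labels]

digits = np.array([3, 0, 9])
encoded = one_hot(digits, num_classes=10)

# argmax inverts the encoding, which is how the accuracy op recovers the class index.
recovered = encoded.argmax(axis=1)
print(encoded.shape, recovered)
```

This form needs the number of classes up front, whereas the encoder in the cell infers the categories from the training labels.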
