# TensorFlow Playground
## January 2023
### by Michelle (Chelle) Davies
This notebook is my environment for practicing TensorFlow's features. Eventually, I will build more specific projects. For now, there's no (intentional) cohesive narrative connecting these datasets.

In [1]:
# imports
import tensorflow as tf
print("TensorFlow version:", tf.__version__)

TensorFlow version: 2.9.1


## Starting with a tutorial on the basics
Source: https://www.tensorflow.org/tutorials/quickstart/beginner

In [2]:
# load the built-in MNIST dataset and scale pixel values from [0, 255] to [0, 1]
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

In [3]:
mnist

<module 'keras.api._v2.keras.datasets.mnist' from '/Users/michelledavies/opt/anaconda3/lib/python3.9/site-packages/keras/api/_v2/keras/datasets/mnist/__init__.py'>

Next, I'm going to build a model. These are the options:
1. Keras Sequential Model
2. Keras Functional API

I'm building a `tf.keras.Sequential` model.

In [4]:
# Build a tf.keras.Sequential model:
model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(input_shape=(28, 28)),
  tf.keras.layers.Dense(128, activation='relu'),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.Dense(10)
])

2023-01-28 20:45:12.319381: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  SSE4.1 SSE4.2 AVX AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.


In [5]:
# get predictions
predictions = model(x_train[:1]).numpy()
predictions

array([[-0.08305063, -0.384033  , -0.02630255,  0.07467729, -0.00108278,
         0.22612998, -0.38243416, -0.3361704 ,  0.7700122 ,  0.5251056 ]],
      dtype=float32)

In [6]:
# The tf.nn.softmax function converts these logits to probabilities for each class:
probabilities = tf.nn.softmax(predictions).numpy()
probabilities

array([[0.08251798, 0.0610708 , 0.08733613, 0.09661597, 0.08956674,
        0.11241493, 0.06116851, 0.06406488, 0.19365515, 0.15158893]],
      dtype=float32)
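
For intuition, here is a minimal NumPy sketch of what the softmax step computes, using the logits printed by cell In [5]. The helper name `softmax` is my own, and this is an illustration of the formula, not TensorFlow's implementation.

```python
import numpy as np

def softmax(logits):
    # subtract the row-wise max for numerical stability; the result is unchanged
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    exps = np.exp(shifted)
    # normalize so that each row sums to 1
    return exps / np.sum(exps, axis=-1, keepdims=True)

# logits copied from the output of cell In [5]
logits = np.array([[-0.08305063, -0.384033, -0.02630255, 0.07467729, -0.00108278,
                    0.22612998, -0.38243416, -0.3361704, 0.7700122, 0.5251056]],
                  dtype=np.float32)

probs = softmax(logits)
print(probs)
print(probs.sum(axis=-1))
```

The printed row should closely match the `probabilities` array above, and it sums to 1 up to floating-point error.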

In [7]:
# Define a loss function for training using losses.SparseCategoricalCrossentropy:
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

In [8]:
loss_fn(y_train[:1], predictions).numpy()

2.1855586
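
As a sanity check, the sparse categorical cross-entropy for a single example is the negative log of the softmax probability assigned to the true class. The sketch below assumes the first training label is 5 (the loss value above is consistent with that) and reuses the probability printed by cell In [6]. The variable names are mine.

```python
import math

# probability assigned to class 5, copied from the output of cell In [6]
# assumption: y_train[0] == 5
p_true_class = 0.11241493

# cross-entropy for one example: -log(probability of the true class)
loss = -math.log(p_true_class)
print(loss)

# for comparison: a model that guesses uniformly over 10 classes
baseline = -math.log(1 / 10)
print(baseline)
```

The first value lands very close to the 2.1855586 reported by `loss_fn`, and the uniform-guess baseline is about 2.3, which is why an untrained model starts near that number.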

Before training, configure and compile the model using Keras `Model.compile`. Set the optimizer class to adam, set the loss to the `loss_fn` function defined earlier, and specify a metric to be evaluated for the model by setting the metrics parameter to accuracy.

In [9]:
model.compile(optimizer='adam', loss=loss_fn, metrics=['accuracy'])

### Train and evaluate the model
Use the `Model.fit` method to adjust the model parameters and minimize the loss:

In [10]:
model.fit(x_train, y_train, epochs=5)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.callbacks.History at 0x7f8583ac72b0>

The `Model.evaluate` method checks the model's performance, usually on a validation set or test set.

In [11]:
model.evaluate(x_test,  y_test, verbose=2)

313/313 - 0s - loss: 0.0752 - accuracy: 0.9775 - 411ms/epoch - 1ms/step


[0.07520777732133865, 0.9775000214576721]

The image classifier is now trained to ~98% accuracy on this dataset. To learn more, read the TensorFlow tutorials.

If you want your model to return probabilities instead of logits, you can wrap the trained model and attach a softmax layer to it:

In [12]:
probability_model = tf.keras.Sequential([
  model,
  tf.keras.layers.Softmax()
])

In [13]:
probability_model(x_test[:5])

<tf.Tensor: shape=(5, 10), dtype=float32, numpy=
array([[6.00225150e-08, 6.36123376e-09, 2.23604980e-06, 1.13524198e-04,
        2.00967229e-10, 4.23381152e-07, 1.63287266e-11, 9.99879479e-01,
        9.60823243e-08, 4.19691833e-06],
       [5.01067632e-10, 1.24961953e-04, 9.99873996e-01, 1.07288713e-06,
        1.89501669e-17, 2.78646484e-09, 3.71616848e-10, 7.09916067e-14,
        1.43693475e-08, 6.90983051e-15],
       [6.86052601e-07, 9.98693049e-01, 1.02048260e-04, 8.30073441e-06,
        1.21576049e-05, 2.47017442e-06, 8.21821686e-06, 1.09229109e-03,
        7.84541917e-05, 2.22939798e-06],
       [9.99809086e-01, 8.07836853e-10, 2.67025280e-05, 4.40207586e-08,
        9.70679963e-08, 2.63008906e-06, 1.58644863e-04, 2.12504165e-06,
        5.88444893e-09, 7.26827295e-07],
       [4.25704030e-07, 1.83847229e-10, 9.49816695e-07, 1.08353246e-08,
        9.99755800e-01, 7.50508832e-07, 3.77229708e-06, 3.49173242e-05,
        1.18268838e-06, 2.02103984e-04]], dtype=float32)>
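
To turn probabilities like these into class predictions, take the index of the largest value in each row. Here is a small NumPy sketch; the `probs` array is made up for illustration (three classes, three examples) and is not copied from the run above.

```python
import numpy as np

# hypothetical probabilities: one row per example, one column per class
probs = np.array([
    [0.05, 0.90, 0.05],
    [0.70, 0.20, 0.10],
    [0.10, 0.15, 0.75],
])

# index of the most probable class in each row
predicted = np.argmax(probs, axis=1)
# the probability the model assigned to that class
confidence = probs.max(axis=1)

print(predicted)
print(confidence)
```

Applied to the real output, this would be `np.argmax(probability_model(x_test[:5]).numpy(), axis=1)`. Reading the tensor above by eye, the largest entries sit at indices 7, 2, 1, 0, and 4.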

## My Own Experiment
Now, I'm going to make a project with my own data and exploration. I am going to explore creating OCR models with TensorFlow and Keras.

In [14]:
import numpy as np

In [16]:
# function to load the alphabet dataset
def load_az_dataset(datasetPath):
    # initialize the list of data and labels
    data = []
    labels = []
    # loop over the rows of the A-Z handwritten letter dataset;
    # the context manager closes the file once the loop finishes
    with open(datasetPath) as f:
        for row in f:
            # parse the label and image from the row
            row = row.split(",")
            label = int(row[0])
            image = np.array([int(x) for x in row[1:]], dtype="uint8")
            # images are represented as single channel (grayscale) images
            # that are 28x28=784 pixels -- we need to take this flattened
            # 784-d list of numbers and reshape them into a 28x28 matrix
            image = image.reshape((28, 28))
            # update the list of data and labels
            data.append(image)
            labels.append(label)
    # convert the data and labels to NumPy arrays
    data = np.array(data, dtype="float32")
    labels = np.array(labels, dtype="int")
    # return a 2-tuple of the A-Z data and labels
    return (data, labels)
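
As a possible alternative to the row-by-row loop in `load_az_dataset`, the same parsing can be sketched with `np.loadtxt`. This swaps in a different technique. The function name `load_az_dataset_vectorized` is hypothetical, and the sketch assumes every row has the form `label,p0,...,p783` with integer values. It is exercised here on a tiny synthetic CSV, not on the real dataset.

```python
import os
import tempfile

import numpy as np

def load_az_dataset_vectorized(dataset_path):
    # hypothetical alternative loader; assumes each row is "label,p0,...,p783"
    # ndmin=2 keeps the result 2-D even if the file has a single row
    raw = np.loadtxt(dataset_path, delimiter=",", dtype="int64", ndmin=2)
    # first column holds the labels, the remaining 784 columns hold the pixels
    labels = raw[:, 0].astype("int")
    data = raw[:, 1:].reshape((-1, 28, 28)).astype("float32")
    return data, labels

# build a tiny synthetic CSV (2 rows) to exercise the function
rows = [[0] + [0] * 784, [25] + [255] * 784]
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False) as f:
    for row in rows:
        f.write(",".join(str(v) for v in row) + "\n")
    tmp_path = f.name

data, labels = load_az_dataset_vectorized(tmp_path)
os.remove(tmp_path)

print(data.shape, data.dtype)
print(labels)
```

This holds the whole table in memory as `int64` before converting, so for a very large file the row-by-row version may be the safer choice.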
