In [2]:
pip install tensorflow

Collecting tensorflow
  Downloading tensorflow-2.17.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (4.2 kB)
Collecting absl-py>=1.0.0 (from tensorflow)
  Downloading absl_py-2.1.0-py3-none-any.whl.metadata (2.3 kB)
Collecting astunparse>=1.6.0 (from tensorflow)
  Downloading astunparse-1.6.3-py2.py3-none-any.whl.metadata (4.4 kB)
Collecting flatbuffers>=24.3.25 (from tensorflow)
  Downloading flatbuffers-24.3.25-py2.py3-none-any.whl.metadata (850 bytes)
Collecting gast!=0.5.0,!=0.5.1,!=0.5.2,>=0.2.1 (from tensorflow)
  Downloading gast-0.6.0-py3-none-any.whl.metadata (1.3 kB)
Collecting google-pasta>=0.1.1 (from tensorflow)
  Downloading google_pasta-0.2.0-py3-none-any.whl.metadata (814 bytes)
Collecting h5py>=3.10.0 (from tensorflow)
  Downloading h5py-3.12.1-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (2.5 kB)
Collecting libclang>=13.0.0 (from tensorflow)
  Downloading libclang-18.1.1-py2.py3-none-manylinux2010_x86_64.whl.metadata (5.2 k

In [3]:
pip install keras

Note: you may need to restart the kernel to use updated packages.


In [2]:
from tensorflow.keras.datasets import mnist

2024-10-21 15:58:19.321104: I external/local_xla/xla/tsl/cuda/cudart_stub.cc:32] Could not find cuda drivers on your machine, GPU will not be used.
2024-10-21 15:58:23.509654: I external/local_xla/xla/tsl/cuda/cudart_stub.cc:32] Could not find cuda drivers on your machine, GPU will not be used.
2024-10-21 15:58:25.964809: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:485] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-10-21 15:58:29.647319: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:8454] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-10-21 15:58:30.555400: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1452] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-10-21 15:58:36.199500: I tensorflow/core/platform/cpu_feature_gu

- The MNIST dataset is returned as four NumPy arrays:
1. `train_images` and `train_labels` form the training set
2. `test_images` and `test_labels` form the test set
3. Images are NumPy arrays of pixel values, and labels are arrays of digits (range: 0-9)

In [4]:
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

- Review the training data

In [5]:
train_images.shape

(60000, 28, 28)

In [6]:
len(train_labels)

60000

In [7]:
train_labels

array([5, 0, 4, ..., 5, 6, 8], dtype=uint8)

In [8]:
test_images.shape

(10000, 28, 28)

In [9]:
len(test_labels)

10000

In [10]:
test_labels

array([7, 2, 1, ..., 4, 5, 6], dtype=uint8)
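Beyond shapes and lengths, it can help to check how the labels are distributed across the ten digit classes. The following is a minimal sketch, assuming a small hypothetical label array in place of the real `train_labels`; `np.bincount` counts how often each integer value occurs.

```python
import numpy as np

# Hypothetical sample of digit labels standing in for train_labels
labels = np.array([5, 0, 4, 1, 9, 2, 1, 3, 1, 4], dtype=np.uint8)

# counts[d] is the number of samples labelled with digit d;
# minlength=10 keeps a slot for every digit even if it never appears
counts = np.bincount(labels, minlength=10)

for digit, count in enumerate(counts):
    print(f"digit {digit}: {count} samples")
```

The same call on the full `train_labels` array would show whether any digit is noticeably under-represented.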

- Build the network

In [11]:
from tensorflow import keras
from tensorflow.keras import layers
model = keras.Sequential([layers.Dense(512, activation="relu"), 
                          layers.Dense(10, activation="softmax")])

### Notes on Neural Networks:
- **layer** : the core building block of a neural network; it acts as a filter for data. Each layer extracts representations from the data fed into it, and layers can be chained together to perform progressive data distillation

- **Dense layer** : a densely connected (_fully connected_) neural layer
 (e.g. a Dense layer with a softmax activation acts as a **softmax classification layer**: it returns an array of x probability scores that sum to 1; each score is the probability that the current sample belongs to one of the x classes)
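To make the two ideas above concrete, here is a minimal NumPy sketch of what a fully connected layer followed by a softmax computes. It is an illustration under stated assumptions, not Keras's implementation: the weights, bias, and inputs are random, hypothetical values, and `dense` and `softmax` are helper names chosen for this example.

```python
import numpy as np

def dense(x, weights, bias):
    """Fully connected layer without activation: every input feeds every output unit."""
    return x @ weights + bias

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1 along the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)  # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
x = rng.random((2, 784))                    # 2 hypothetical flattened 28x28 images
w = rng.normal(scale=0.01, size=(784, 10))  # hypothetical weights: 784 inputs -> 10 classes
b = np.zeros(10)                            # hypothetical bias, one value per class

probs = softmax(dense(x, w, b))
print(probs.shape)        # one row of 10 class scores per image
print(probs.sum(axis=1))  # each row sums to 1 (up to floating-point rounding)
```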

#### Preparing a Neural Network for Training: A Checklist

 - What is the **optimizer**? : The optimizer is the mechanism the model uses to _update itself based on training data it consumes_ with the ultimate goal being to improve its performance

 - What is the **loss function**? This function enables the model to measure its performance on the training data 

 - What **metrics** will be monitored during training and testing? This may vary but typically _accuracy_ is the metric most often monitored 
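The loss and the metric in this checklist can be computed by hand on a tiny example. The sketch below assumes integer class labels and rows of predicted probabilities, which is the setting that the `sparse_categorical_crossentropy` loss name used in this notebook refers to. The helper functions and the sample values are hypothetical and only illustrate the arithmetic.

```python
import numpy as np

def sparse_categorical_crossentropy(labels, probs):
    """Mean negative log-probability assigned to the correct class (labels are integer class ids)."""
    picked = probs[np.arange(len(labels)), labels]  # probability given to the true class of each sample
    return float(-np.log(picked).mean())

def accuracy(labels, probs):
    """Fraction of samples whose highest-probability class equals the label."""
    return float((probs.argmax(axis=1) == labels).mean())

labels = np.array([0, 2, 1])
probs = np.array([
    [0.7, 0.2, 0.1],  # true class 0 gets 0.7: correct
    [0.1, 0.1, 0.8],  # true class 2 gets 0.8: correct
    [0.5, 0.3, 0.2],  # true class 1 gets only 0.3: misclassified as class 0
])

loss = sparse_categorical_crossentropy(labels, probs)
acc = accuracy(labels, probs)
print(f"loss: {loss:.4f}, accuracy: {acc:.4f}")
```

The loss keeps shrinking as the correct class receives more probability, even when the predicted class does not change. That gives the optimizer a smoother signal to follow than accuracy does.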


In [12]:
# Compilation step
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

In [14]:
# Prepare the data: reshape it into the shape the model expects and scale it so that
# all values are in the [0, 1] interval
# Refer to the cells above to see how the data has changed

train_images = train_images.reshape((60000, 28 * 28))  # flatten from (60000, 28, 28) to (60000, 784)
train_images = train_images.astype("float32") / 255    # convert uint8 to float32 and scale [0, 255] to [0, 1]
test_images = test_images.reshape((10000, 28 * 28))    # flatten from (10000, 28, 28) to (10000, 784)
test_images = test_images.astype("float32") / 255      # same conversion and scaling for the test set
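The same preparation can be wrapped in a small function so that training and test images go through identical steps. This is a minimal sketch on synthetic data; `prepare` is a hypothetical helper, and the dtype check is a design choice made here so that re-running the step does not divide by 255 a second time.

```python
import numpy as np

def prepare(images):
    """Flatten each image to a vector and scale raw uint8 pixels to [0, 1]."""
    flat = images.reshape((len(images), -1))  # -1 lets NumPy infer 28 * 28 = 784
    if flat.dtype == np.uint8:                # only scale raw pixel data, so re-running is harmless
        flat = flat.astype("float32") / 255
    return flat

# Hypothetical stand-in for a batch of MNIST images: 4 images, 28x28, uint8 pixels
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8)

once = prepare(images)
twice = prepare(once)  # a second call leaves the data unchanged

print(once.shape, once.dtype, float(once.min()), float(once.max()))
```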

In [15]:
# Train the model (the data is now "consumable" because we reshaped and retyped it)
model.fit(train_images, train_labels, epochs=5, batch_size=128)
# Notice in the output that the model's loss and accuracy on the training data are reported for each epoch

Epoch 1/5

2024-10-21 16:07:53.957274: W external/local_tsl/tsl/framework/cpu_allocator_impl.cc:83] Allocation of 188160000 exceeds 10% of free system memory.

469/469 ━━━━━━━━━━━━━━━━━━━━ 3s 5ms/step - accuracy: 0.5000 - loss: 1.8699
Epoch 2/5
469/469 ━━━━━━━━━━━━━━━━━━━━ 3s 5ms/step - accuracy: 0.8514 - loss: 0.6180
Epoch 3/5
469/469 ━━━━━━━━━━━━━━━━━━━━ 2s 5ms/step - accuracy: 0.8875 - loss: 0.4224
Epoch 4/5
469/469 ━━━━━━━━━━━━━━━━━━━━ 2s 5ms/step - accuracy: 0.8986 - loss: 0.3578
Epoch 5/5
469/469 ━━━━━━━━━━━━━━━━━━━━ 3s 5ms/step - accuracy: 0.9066 - loss: 0.3294


<keras.src.callbacks.history.History at 0x71c0d616e210>
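The `469/469` in the training output is the number of batches processed per epoch. A short sketch of the arithmetic, using the sample count and the `batch_size` and `epochs` values from the cells above:

```python
import math

num_samples = 60000  # training images, from train_images.shape
batch_size = 128     # value passed to model.fit
epochs = 5           # value passed to model.fit

steps_per_epoch = math.ceil(num_samples / batch_size)  # round up: the last batch may be smaller
last_batch_size = num_samples - (steps_per_epoch - 1) * batch_size
total_updates = steps_per_epoch * epochs               # one weight update per batch

print(steps_per_epoch, last_batch_size, total_updates)  # 469 96 2345
```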

In [19]:
# Use the model to make predictions
test_digits = test_images[0:10]  # take the first 10 test images
predictions = model.predict(test_digits)  # one row of 10 class probabilities per image
predictions[0]  # look at the first of these predictions

# Interpret the output: the number at index i of the output array is the probability (in [0, 1]) that the sample belongs to class i

1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 21ms/step


array([0.        , 0.        , 0.        , 0.        , 0.        ,
       0.        , 0.        , 0.99999994, 0.        , 0.        ],
      dtype=float32)

In [20]:
# Identify the prediction the model makes, i.e. the digit it believes matches the image
predictions[0].argmax()

7

In [21]:
#Retrieve the probability that the image at the 0th index is the number 7
predictions[0][7]

0.99999994

In [22]:
#Compare this result with the test labels
test_labels[0]

7
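The single-image check above extends to a whole batch by taking the argmax along the class axis. This is a minimal sketch with hypothetical prediction rows and labels, not the notebook's actual `predictions` array:

```python
import numpy as np

# Hypothetical softmax outputs for 3 samples over 10 classes (each row sums to 1)
predictions = np.array([
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0],
    [0.1, 0.0, 0.8, 0.0, 0.0, 0.0, 0.1, 0.0, 0.0, 0.0],
    [0.0, 0.6, 0.0, 0.0, 0.0, 0.0, 0.0, 0.4, 0.0, 0.0],
])
labels = np.array([7, 2, 7])  # hypothetical true labels

predicted_classes = predictions.argmax(axis=1)  # most likely class for each row
confidences = predictions.max(axis=1)           # probability assigned to that class
matches = predicted_classes == labels           # True where the prediction is correct

for pred, conf, ok in zip(predicted_classes, confidences, matches):
    print(f"predicted {pred} with probability {conf:.2f}, correct: {ok}")
```

Taking the mean of `matches` gives the accuracy on the batch.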

In [23]:
# Now evaluate the model on data it has never seen: the test set
test_loss, test_acc = model.evaluate(test_images, test_labels)
print(f"test_acc: {test_acc}")
# Test accuracy (about 0.82) is lower than the final training accuracy (about 0.91); this gap may be a symptom of overfitting.
# The test loss (about 54) is also far above the final training loss (about 0.33), which may instead point to a
# preprocessing mismatch between the training and test inputs; that is an assumption worth checking.

 27/313 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.7911 - loss: 46.3404

2024-10-21 17:11:48.008067: W external/local_tsl/tsl/framework/cpu_allocator_impl.cc:83] Allocation of 31360000 exceeds 10% of free system memory.

313/313 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8029 - loss: 54.2387
test_acc: 0.8248000144958496
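The test loss reported above (about 54) is far larger than the final training loss (about 0.33). One possible cause, separate from overfitting, is that the arrays passed to `model.evaluate` are on a different scale from those used in `model.fit`. The sketch below shows one way to check this. The helpers `describe` and `same_scale`, the tolerance, and the arrays are all hypothetical, and the synthetic data only imitates a scaled and an unscaled input.

```python
import numpy as np

def describe(name, array):
    """Return a one-line summary of dtype and value range for an input array."""
    return f"{name}: dtype={array.dtype}, min={float(array.min()):.3f}, max={float(array.max()):.3f}"

def same_scale(a, b, tolerance=0.5):
    """Rough heuristic (an assumption of this sketch): do the two arrays have similar maxima?"""
    return abs(float(a.max()) - float(b.max())) <= tolerance

rng = np.random.default_rng(0)
train_like = rng.random((8, 784)).astype("float32")                 # scaled inputs in [0, 1)
test_like = rng.integers(0, 256, size=(8, 784)).astype("float32")   # inputs left in [0, 255]

print(describe("train_like", train_like))
print(describe("test_like", test_like))
print("same scale before fix:", same_scale(train_like, test_like))
print("same scale after fix: ", same_scale(train_like, test_like / 255))
```

Running `describe` on the notebook's own `train_images` and `test_images` just before `model.fit` and `model.evaluate` would show whether both are in the [0, 1] interval.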
