TensorFlow is a powerful framework for machine learning. For my purposes, which are simple feed-forward neural networks, its built-in Keras API is enough.

In [13]:
import tensorflow as tf
from tensorflow.keras import layers, models

Later, I will make guides on how to use custom datasets stored locally, but for now, the following is how you download and use publicly available datasets. We normalize the greyscale images so that, instead of ranging from 0 to 255, the pixel values range from 0 to 1, which is a convenient scale for training.

In [14]:
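# Download MNIST (cached after the first call) as NumPy arrays of 8-bit pixels
(train_images, train_label), (test_images, test_labels) = tf.keras.datasets.mnist.load_data()
# Scale pixel values from [0, 255] down to [0, 1]
train_images = train_images.astype('float32') / 255
test_images = test_images.astype('float32') / 255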

import matplotlib.pyplot as plt

n = 958
plt.imshow(train_images[n], cmap='gray')
plt.title(f"Label: {train_label[n]}")
plt.show()
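
To see what the normalization step does without downloading anything, here is a minimal NumPy sketch. The array `fake_images` is a made-up stand-in for real image data, not part of MNIST.

```python
import numpy as np

# Hypothetical stand-in for image data: 8-bit pixels in the range 0..255.
fake_images = np.array([[0, 64, 128, 255]], dtype=np.uint8)

# Cast to float first, then divide by the largest possible pixel value.
scaled = fake_images.astype('float32') / 255

print(scaled.dtype, scaled.min(), scaled.max())
```

Dividing by 255 (the maximum value of an 8-bit pixel) is what guarantees that the largest value comes out as exactly 1.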

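train_images and test_images are NumPy arrays, which Keras converts to tensors automatically. I can create my own tensors using tf.Variable() and tf.zeros(). train_images.shape is going to tell us the dimensions of the array, which for the MNIST training set is (60000, 28, 28). Note that shape is an attribute, not a method, so it takes no parentheses.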
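
A short sketch of creating tensors by hand, assuming TensorFlow 2.x is installed. The names `zeros`, `weights` and `array` are only illustrative, and `array` is a small made-up stand-in for train_images.

```python
import numpy as np
import tensorflow as tf

# A constant tensor filled with zeros.
zeros = tf.zeros((2, 3))

# A Variable is a tensor whose values can be updated in place.
weights = tf.Variable([[1.0, 2.0], [3.0, 4.0]])
weights.assign_add(tf.ones((2, 2)))

# Hypothetical stand-in for train_images: 5 blank 28x28 images.
array = np.zeros((5, 28, 28), dtype='float32')
as_tensor = tf.convert_to_tensor(array)

# shape is an attribute on NumPy arrays and on tensors alike.
print(zeros.shape, array.shape, as_tensor.shape)
```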

In [15]:
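model = models.Sequential([
    layers.Input(shape=(28, 28)),            # one 28x28 greyscale image per sample
    layers.Flatten(),                        # 28x28 matrix -> vector of 784 values
    layers.Dense(128, activation='relu'),    # first hidden layer
    layers.Dense(64, activation='relu'),     # second hidden layer
    layers.Dense(10, activation='softmax'),  # output layer: one probability per digit
])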

The above piece of code is really important, so it is worth understanding every line of it. 'Sequential' is the Keras class for ordinary feed-forward neural networks, where the layers are stacked one after another; it is what I will be using almost all of the time. The first layer is the input layer of our neural network, which declares the shape of a single sample. The 'Flatten' layer converts the input matrix to a vector, i.e. flattens the 28x28 image out into 784 values. In our case, the two 'Dense' layers that follow (128 and 64 neurons) are what are called 'hidden layers'. Hidden layers can have as many neurons as you like. The activation function is the nonlinearity applied to the output of each neuron. 'relu' stands for rectified linear unit, which computes max(0, x) and is good enough for most of our applications. 'Dense' refers to the fact that the layers are densely connected, i.e. each neuron is connected to every neuron in the previous layer. The last layer is always the output layer, containing as many neurons as we have outputs, here 10, one per digit. The 'softmax' activation function in the output layer is used when we want a probabilistic output, normalized so that the values sum to 1.
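
To make the layer descriptions concrete, here is a rough NumPy sketch of what a forward pass through a network of this shape computes. It is not how Keras is implemented internally; the random weights, the helper names (`relu`, `softmax`, `forward`) and the input `batch` are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    # Rectified linear unit: negative values become 0.
    return np.maximum(0.0, x)

def softmax(x):
    # Subtract the row maximum for numerical stability, then normalize.
    exp = np.exp(x - x.max(axis=-1, keepdims=True))
    return exp / exp.sum(axis=-1, keepdims=True)

# 784 inputs -> 128 -> 64 -> 10 outputs, with random (untrained) weights.
layer_sizes = [784, 128, 64, 10]
params = [
    (rng.normal(0.0, 0.05, size=(n_in, n_out)), np.zeros(n_out))
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
]

def forward(images):
    x = images.reshape(len(images), -1)   # Flatten: (N, 28, 28) -> (N, 784)
    for w, b in params[:-1]:
        x = relu(x @ w + b)               # hidden Dense layers
    w, b = params[-1]
    return softmax(x @ w + b)             # output Dense layer

batch = rng.random((2, 28, 28))           # two made-up "images"
probs = forward(batch)
n_params = sum(w.size + b.size for w, b in params)

print(probs.shape, probs.sum(axis=1), n_params)
```

Each row of `probs` sums to 1, which is the softmax guarantee, and `n_params` counts the weights and biases of the three dense layers.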

In [16]:
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

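The optimizer is the algorithm that adjusts the weights, using the gradients computed by backpropagation. 'adam' is pretty good and should be left as is. Similarly, the loss function is the method used for quantifying how far the predictions are from the desired results. 'sparse_categorical_crossentropy' fits our case because the labels are plain integers (0 to 9) rather than one-hot vectors. As with the optimizer, it should be left as is unless you have a good reason to change it.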
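
As a rough illustration of both ideas, the sketch below computes a sparse categorical cross-entropy by hand and then runs plain gradient descent on a toy function. Plain gradient descent is a simpler stand-in for Adam, not Adam itself, and the probabilities, labels and toy function are made up for the example.

```python
import numpy as np

def sparse_categorical_crossentropy(probs, labels):
    """Mean of -log(probability assigned to the true class)."""
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.log(picked).mean())

# Made-up predictions for two samples over three classes.
probs = np.array([[0.90, 0.05, 0.05],
                  [0.20, 0.20, 0.60]])

# The true labels are integers, not one-hot vectors.
loss = sparse_categorical_crossentropy(probs, np.array([0, 2]))
# The same predictions scored against labels they got wrong.
wrong_loss = sparse_categorical_crossentropy(probs, np.array([1, 0]))

# Plain gradient descent on f(w) = (w - 3)^2, whose minimum is at w = 3.
w = 0.0
learning_rate = 0.1
for _ in range(50):
    gradient = 2.0 * (w - 3.0)
    w -= learning_rate * gradient

print(loss, wrong_loss, w)
```

The loss is small when the model puts high probability on the correct class and large when it does not. The optimizer's job is to move the weights in the direction that shrinks it.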

In [17]:
model.fit(train_images, train_label, epochs=10)

Epoch 1/10
[1m1875/1875[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m4s[0m 2ms/step - accuracy: 0.8751 - loss: 0.4166
Epoch 2/10
[1m1875/1875[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m3s[0m 2ms/step - accuracy: 0.9692 - loss: 0.1015
Epoch 3/10
[1m1875/1875[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m3s[0m 2ms/step - accuracy: 0.9802 - loss: 0.0644
Epoch 4/10
[1m1875/1875[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m3s[0m 2ms/step - accuracy: 0.9840 - loss: 0.0506
Epoch 5/10
[1m1875/1875[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m4s[0m 2ms/step - accuracy: 0.9873 - loss: 0.0394
Epoch 6/10
[1m1875/1875[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m5s[0m 3ms/step - accuracy: 0.9900 - loss: 0.0304
Epoch 7/10
[1m1875/1875[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m5s[0m 2ms/step - accuracy: 0.9918 - loss: 0.0249
Epoch 8/10
[1m1875/1875[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m5s[0m 2ms/step - accuracy: 0.9930 - loss: 0.0214
Epoch 9/10
[1m1875/1875

<keras.src.callbacks.history.History at 0x7f4cb2b10e50>

Here we train the model by specifying the input data (train_images) and the labels that each input is supposed to correspond to (train_label). The number of epochs is how many times the full training set is fed to the model. It should not be too high, to avoid overfitting. A good balance should be determined by comparing the training accuracy with the evaluated accuracy: if the training accuracy keeps rising while the evaluated accuracy stops improving, the model has started to overfit.
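
One possible way to pick the number of epochs automatically is to hold out part of the training data for validation and stop when the validation loss stops improving. The sketch below assumes TensorFlow 2.x and uses random made-up data (`x`, `y`) and a tiny `demo_model` so that it runs quickly; the `patience` value is an arbitrary choice.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

# Random stand-in data: 256 fake 28x28 "images" with random digit labels.
rng = np.random.default_rng(0)
x = rng.random((256, 28, 28)).astype('float32')
y = rng.integers(0, 10, size=256)

demo_model = models.Sequential([
    layers.Input(shape=(28, 28)),
    layers.Flatten(),
    layers.Dense(16, activation='relu'),
    layers.Dense(10, activation='softmax'),
])
demo_model.compile(optimizer='adam',
                   loss='sparse_categorical_crossentropy',
                   metrics=['accuracy'])

# Stop once the validation loss has not improved for 2 epochs in a row,
# and roll the weights back to the best epoch seen.
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor='val_loss', patience=2, restore_best_weights=True)

history = demo_model.fit(x, y,
                         epochs=10,
                         validation_split=0.2,   # hold out 20% for validation
                         callbacks=[early_stop],
                         verbose=0)

epochs_run = len(history.history['loss'])
print(epochs_run)
```

With the real data, the same `validation_split` and `callbacks` arguments could be passed to the model.fit call above, in which case epochs acts as an upper limit rather than a fixed count.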

In [18]:
test_loss, test_accuracy = model.evaluate(test_images, test_labels)

313/313 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.9709 - loss: 0.1214
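
To show what the accuracy number means, here is a small NumPy sketch that turns softmax outputs into predicted digits and scores them. The `probs` and `labels` arrays are made up; with the trained model, the probabilities would come from model.predict(test_images).

```python
import numpy as np

# Made-up softmax outputs for three images over the 10 digit classes.
probs = np.full((3, 10), 0.02)
probs[0, 7] = 0.82
probs[1, 2] = 0.82
probs[2, 4] = 0.82

# Made-up true labels; the third prediction will be wrong.
labels = np.array([7, 2, 9])

# The predicted digit is the class with the highest probability.
predicted = np.argmax(probs, axis=1)

# Accuracy is the fraction of predictions that match the labels.
accuracy = float((predicted == labels).mean())

print(predicted, accuracy)
```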


That's it! If your fundamentals are good, training and using simple neural networks with TensorFlow isn't that hard. Now you can modify the structure of this code as per your needs. Best of luck!