In [1]:
!pip install tensorflow

ERROR: Could not find a version that satisfies the requirement tensorflow (from versions: none)
ERROR: No matching distribution found for tensorflow


In [2]:
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix

ModuleNotFoundError: No module named 'tensorflow'

#TensorFlow
TensorFlow is like a toolbox made by Google to help you build and train machine learning and deep learning models.
You can use it to build models for things like:
*   Image recognition
*   Language translation
*   Predicting stock prices
---
#Keras
Keras is like a friendly interface (a helper) that sits on top of TensorFlow.
* It lets you build AI models more easily using fewer lines of code.
* Instead of doing all the complicated stuff manually, Keras gives you simple blocks (like layers, models, optimizers) to stack and play with.
* It’s like using LEGO blocks to build a house instead of carving each brick by hand.
---
#`tensorflow.keras.layers`
In `tensorflow.keras`, the `layers` module gives you building blocks to create neural networks.
* Each layer takes input, does some math, and passes output to the next layer.
* The layers module has pre-made layer types like:
    * `Dense` – fully connected layer (every input is connected to every neuron in the layer)
    * `Conv2D` – for images (convolutional layer)
    * `Dropout` – randomly turns off some neurons to prevent overfitting
    * `Flatten`, `ReLU`, `LSTM`, etc.

Instead of writing math manually, you can say:
```
from tensorflow.keras import layers

# A layer with 64 neurons and ReLU activation
my_layer = layers.Dense(64, activation='relu')
```
This one line gives you a full, trainable layer in a neural network.

Without Keras, every part of training has to be written by hand, including the activation functions and the gradient and weight updates. TensorFlow still gives you the low-level advantages: highly optimized tensor operations, hardware acceleration, and automatic gradient computation. Low-level TensorFlow is therefore useful when you need a custom model or training procedure, but if you want to use established architectures such as CNNs or RNNs, it's better and easier to use Keras.
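
To give a feel for what "writing training by hand" involves, here is a minimal sketch in plain NumPy (not TensorFlow). It fits a single linear layer with gradient descent on made-up data. The data, the learning rate, and the names `true_w`, `mse`, and `losses` are assumptions chosen for illustration, not part of this notebook's model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data (assumed): 32 samples with 4 features; targets follow a known linear rule.
X = rng.normal(size=(32, 4))
true_w = np.array([1.0, -2.0, 0.5, 3.0])
y = X @ true_w

# Parameters we have to create and track ourselves.
w = np.zeros(4)
b = 0.0
learning_rate = 0.1

def mse(pred, target):
    """Mean squared error between predictions and targets."""
    return float(np.mean((pred - target) ** 2))

losses = []
for step in range(100):
    pred = X @ w + b                       # forward pass
    error = pred - y
    losses.append(mse(pred, y))
    grad_w = 2.0 * X.T @ error / len(X)    # gradient of the MSE with respect to w
    grad_b = 2.0 * error.mean()            # gradient of the MSE with respect to b
    w -= learning_rate * grad_w            # manual weight update
    b -= learning_rate * grad_b

print(f"first loss: {losses[0]:.4f}, last loss: {losses[-1]:.6f}")
```

With Keras, the forward pass, the gradients, and the updates are all handled by `compile` and `fit`.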








In [None]:
inputs = tf.keras.Input(shape=(784,))

This line creates a placeholder for input data that will be fed into your model.

`tf.keras.Input(...)`
* This creates a Keras input layer — the entry point of your neural network.
* It tells the model: "This is what kind of data I expect."

`shape=(784,)`
* This means each input sample has 784 features.
* Common in flattened images — for example:
  * A 28x28 grayscale image = 784 pixels → flattened to a vector
  * Like in the MNIST digit dataset

In [None]:
dense = layers.Dense(64, activation="relu")
x = dense(inputs)

Here `dense` is a single neural network layer that takes the input and converts it to 64 outputs. `x` holds the output of that layer after it has processed `inputs`. The layer takes 784 values per sample and produces 64, so `x` has 64 values per sample.
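
As a rough illustration of the math inside such a layer, the sketch below computes `relu(inputs @ weights + bias)` in NumPy. The helper name `dense_relu` and the random weights are assumptions made for this example. Keras creates and initializes its own weights, so the actual numbers will differ.

```python
import numpy as np

rng = np.random.default_rng(0)

def dense_relu(inputs, weights, bias):
    """Compute relu(inputs @ weights + bias) for a batch of input vectors."""
    return np.maximum(inputs @ weights + bias, 0.0)

batch = rng.random((5, 784))                       # 5 made-up flattened images
weights = rng.normal(scale=0.05, size=(784, 64))   # one column of weights per neuron
bias = np.zeros(64)                                # one bias per neuron

x = dense_relu(batch, weights, bias)
print(x.shape)  # (5, 64)
```

Each of the 5 samples goes in with 784 values and comes out with 64, and ReLU guarantees that none of them are negative.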

In [None]:
x = layers.Dense(64, activation="relu")(x)
outputs = layers.Dense(10)(x)

We then add the next layer, which further extracts features. This will process `x` from the previous step and
result in a new `x` that is also 64-dimensional, before passing those outputs to a final layer that is 10-dimensional. The outputs of that layer go into a variable called `outputs`.

Each of those 10 outputs is a raw score (a logit) for one of the digits 0 to 9; the final layer applies no softmax. Having defined the layers of the network we can now construct
the model:

In [None]:
model = tf.keras.Model(inputs=inputs, outputs=outputs, name="mnist_model")
model.summary()

`model = tf.keras.Model(inputs=inputs, outputs=outputs, name="mnist_model")`

This line creates a Keras model object — the actual neural network you can train, evaluate, and use to make predictions.

`inputs=inputs`

This is the input layer you defined earlier. It tells the model: "Here’s where the data enters the network."

`outputs=outputs`

This is the final layer of your model — the output layer.
It tells the model: "Here’s what the model should return."

`name="mnist_model"`

This gives your model a name (optional), useful for saving/loading or viewing in TensorBoard.
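
`model.summary()` reports a parameter count for each layer. As a sanity check, the counts for this architecture can be worked out by hand: a `Dense` layer has one weight per input-output pair plus one bias per output. The helper `dense_params` below is a hypothetical name used only for this sketch.

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

# Sizes taken from the model above: 784 inputs -> 64 -> 64 -> 10 outputs.
layer_sizes = [784, 64, 64, 10]
counts = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]

print(counts)       # [50240, 4160, 650]
print(sum(counts))  # 55050
```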

Loading the data:

In [None]:
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train = x_train.reshape(60000, 784).astype("float32") / 255
x_test = x_test.reshape(10000, 784).astype("float32") / 255

`.reshape(60000, 784)` This flattens each 28×28 image into a 1D vector of 784 values. This is needed because the model's input was declared with `shape=(784,)`, so it expects flat vectors, not 2D images.

`/ 255` Normalizes the pixel values to be between 0 and 1 instead of 0 to 255. This makes training faster and more stable.
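
The same two steps can be tried on a small stand-in array without downloading MNIST. The random `images` array below is an assumption made for illustration; only the reshape and scaling mirror the cell above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for MNIST (assumed data): 6 random 28x28 images with integer pixels in [0, 255].
images = rng.integers(0, 256, size=(6, 28, 28), dtype=np.uint8)

# Flatten each image to 784 values, convert to float32, and scale to [0, 1].
flat = images.reshape(len(images), 784).astype("float32") / 255

print(flat.shape, flat.dtype)                # (6, 784) float32
print(flat.min() >= 0.0, flat.max() <= 1.0)  # True True
```

Reshaping keeps the pixels in row order, so pixel `(row, col)` of an image ends up at index `row * 28 + col` of its flattened vector.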

Having loaded the data, compile the model as follows. You will need to specify a loss function (to determine how good
the predictions are), an optimiser, and accuracy metrics to report on how well the predictions do. You can then
train the model.

In [None]:
model.compile(
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    optimizer=tf.keras.optimizers.RMSprop(),
    metrics=["accuracy"],
)

history = model.fit(x_train, y_train, batch_size=64, epochs=2, validation_split=0.2)

test_scores = model.evaluate(x_test, y_test, verbose=2)
print("Test accuracy :", test_scores[1])
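
The loss used above takes integer labels ("sparse") and raw scores rather than probabilities (`from_logits=True`). A minimal NumPy sketch of that computation follows. The function name and the example numbers are assumptions for illustration, and the result is averaged over the batch, which is my understanding of the Keras default.

```python
import numpy as np

def sparse_categorical_crossentropy_from_logits(logits, labels):
    """Mean cross-entropy between integer labels and softmax(logits)."""
    shifted = logits - logits.max(axis=1, keepdims=True)   # subtract the row max for numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    picked = log_probs[np.arange(len(labels)), labels]     # log-probability of the true class
    return float(-picked.mean())

# Two made-up samples with 3 classes each.
logits = np.array([[2.0, 0.5, -1.0],
                   [0.0, 0.0, 0.0]])
labels = np.array([0, 2])

loss = sparse_categorical_crossentropy_from_logits(logits, labels)
print(loss)
```

A confident, correct score (first row) contributes a small loss, while a row of equal scores contributes `log(3)`, the loss of guessing uniformly among 3 classes.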

`batch_size=64`

The model doesn’t train on all 60,000 samples at once. Instead, it processes them in mini-batches of 64 samples at a time. This makes training faster and more memory-efficient.

`epochs=2`

An epoch = one full pass through all the training data.
Here, the model will go through the full training set 2 times.

`validation_split=0.2`

20% of x_train and y_train is set aside for validation. The model uses that 20% to test itself after each epoch (but doesn’t learn from it).
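
Putting these three settings together gives the number of weight updates per epoch. The arithmetic below is a small sketch; the variable names are chosen for this example.

```python
import math

n_samples = 60000
validation_split = 0.2
batch_size = 64

n_val = int(n_samples * validation_split)           # samples held out for validation
n_train = n_samples - n_val                         # samples used for weight updates
steps_per_epoch = math.ceil(n_train / batch_size)   # mini-batches per epoch

print(n_train, n_val, steps_per_epoch)  # 48000 12000 750
```

So with `epochs=2`, the model performs 1,500 mini-batch updates in total.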

#Confusion Matrix of the trained model:

In [None]:
predictions = model.predict(x_test).argmax(axis=1)

plt.figure()
plt.imshow(confusion_matrix(y_test, predictions))
plt.show()

The result of the predict method is 10 outputs for each input image. These are scores for each digit: a score for the image being a 0, a score for it being a 1, and
so on. They are raw logits rather than probabilities, because the final layer has no softmax, but a higher score still means a more likely digit. What you need to do is find the largest of these, which will give you the most likely prediction; the
argmax method will do this for you. The axis=1 argument makes argmax return the index of the highest value in each row, so that
you can do the whole lot in one go.
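
A small NumPy sketch shows both steps on made-up scores: taking the argmax along `axis=1`, then counting (true label, predicted label) pairs. The `scores` and `true_labels` arrays are assumptions for illustration, and the hand-built matrix follows the same rows-are-true, columns-are-predicted layout as `confusion_matrix`.

```python
import numpy as np

# Made-up model outputs for 4 images and 3 classes (one row per image).
scores = np.array([[ 2.1, -0.3,  0.4],
                   [ 0.2,  1.7, -1.0],
                   [-0.5,  0.1,  3.2],
                   [ 1.9,  2.4,  0.0]])
true_labels = np.array([0, 1, 2, 0])

predicted = scores.argmax(axis=1)   # index of the largest score in each row
print(predicted)                    # [0 1 2 1]

# Confusion matrix: rows are true labels, columns are predicted labels.
n_classes = scores.shape[1]
matrix = np.zeros((n_classes, n_classes), dtype=int)
np.add.at(matrix, (true_labels, predicted), 1)
print(matrix)
```

The last image has true label 0 but is predicted as 1, so it lands off the diagonal at row 0, column 1. A well-trained model puts most of its counts on the diagonal.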