HOW TO CONDUCT A NEAR-PERFECT MACHINE LEARNING PROJECT:
    1. Examine your problem
        It is good to know what type of data you are working with (numerical, categorical, or images).
        What is the end goal of your model?
        How will you define and measure 'accuracy'?
        Do you have an idea of which algorithm will work best on this type of project and data?

    2. Prepare your data (raw data, feature extraction, feature engineering)
        Here you load the data from your disk, examine it, and decide whether you need to perform feature extraction or
        feature engineering.
            #Feature extraction involves applying an algorithm to quantify your data
            #Feature engineering involves transforming your raw input data into a representation that better represents the
            underlying problem.

    3. Spot check different algorithms
        Here you simply take a set of various ML algorithms and apply them to the dataset, with the goal of gaining
        experience with your project and understanding which ML algorithms perform well on the problem and which ones
        do not.
        
    4. Examine the results
        Here you are able to understand which machine learning models worked well and which did not.
        You are also able to see patterns that emerge across the multiple experiments you have done.
        You discover which algorithms perform well and under what conditions.

    5. Double down on the algorithms that worked best
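
Steps 3 to 5 can be sketched roughly as below. This is only a minimal illustration in plain Python: the toy data is made up, and the two candidates (a majority-class baseline and a hand-written 1-nearest-neighbor) are stand-ins chosen for brevity, not the algorithms you would necessarily use on a real project.

```python
import random

def accuracy(predict, X, y):
    """Fraction of samples for which predict(x) equals the true label."""
    return sum(predict(xi) == yi for xi, yi in zip(X, y)) / len(y)

def fit_majority(X, y):
    """Baseline: always predict the most frequent training label."""
    most_common = max(set(y), key=y.count)
    return lambda x: most_common

def fit_1nn(X, y):
    """1-nearest-neighbor: predict the label of the closest training point."""
    def predict(x):
        distances = [sum((a - b) ** 2 for a, b in zip(x, xi)) for xi in X]
        return y[distances.index(min(distances))]
    return predict

# Made-up toy data: two well-separated 2D clusters (not a real dataset).
random.seed(0)
X = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(50)]
X += [(random.gauss(5, 1), random.gauss(5, 1)) for _ in range(50)]
y = [0] * 50 + [1] * 50

# Hold back 20% of the data for testing.
order = list(range(len(X)))
random.shuffle(order)
train, test = order[:80], order[80:]
X_train, y_train = [X[i] for i in train], [y[i] for i in train]
X_test, y_test = [X[i] for i in test], [y[i] for i in test]

# Step 3: spot check each candidate. Step 4: examine the results.
candidates = {"majority baseline": fit_majority, "1-nearest-neighbor": fit_1nn}
results = {name: accuracy(fit(X_train, y_train), X_test, y_test)
           for name, fit in candidates.items()}
for name, score in results.items():
    print(f"{name}: {score:.2f}")

# Step 5: pick the best performer to double down on.
best = max(results, key=results.get)
print("best:", best)
```

Keeping every candidate behind the same fit/predict shape makes it easy to add more algorithms to the comparison later.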

Some of the different algorithms you can use:
    1. K-Nearest Neighbors(K-NN)
    2. Naive Bayes
    3. Logistic Regression
    4. Support Vector Machines(SVMs)
    5. Decision Trees
    6. Random Forest
    7. Convolutional Neural Network (CNN)
    

#Creating the first neural network using the Fashion MNIST dataset
#MNIST (Modified National Institute of Standards and Technology)

#The dataset contains images of 10 different types of clothing, such as shirts, trousers, and dresses.
#Every image is a 28 * 28 grid with pixel values between 0 and 255


In [25]:
#importing tensorflow package
import tensorflow as tf

            Loading the Fashion MNIST dataset from Keras

In [26]:
data = tf.keras.datasets.fashion_mnist

In [27]:
(training_images,training_labels), (test_images, test_labels) = data.load_data()

The downloaded data has 60,000 training images and 10,000 test images. Running data.load_data() therefore returns an array of 60,000 28*28 pixel arrays (the training images) and 60,000 values between 0 and 9 (the training labels), along with the corresponding arrays for the 10,000 test images and test labels.

The test images and test labels are held back so that the model does not see them during training.

NORMALIZING
The data is normalized to improve the model's performance.

In [28]:
#All the pixels in the images are grayscale, with values between 0 and 255.
#Dividing by 255 ensures every pixel is represented by a number between 0 and 1.
#This process is called normalizing
training_images = training_images / 255.0

test_images = test_images / 255.0
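
To see what the division does, here is a small sketch on a made-up 2x3 "image" (the pixel values are invented for illustration, and NumPy is assumed to be installed). It shows that the integers 0 to 255 become floats between 0 and 1.

```python
import numpy as np

# A made-up 2x3 "image" of grayscale pixel values (uint8, 0-255).
pixels = np.array([[0, 64, 128], [191, 254, 255]], dtype=np.uint8)

# Same operation as applied to training_images and test_images.
normalized = pixels / 255.0

print(normalized)
print(normalized.min(), normalized.max())  # 0.0 1.0
```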

Defining Neural Network

Flatten- the input layer, which takes the 28*28 images. It is not a layer of neurons; it converts each 2D square grid of pixel values into a 1D array of 784 values.
Dense- a layer of neurons (the hidden layer). Here it has 128 neurons, whose internal parameters are randomly initialized.
#You should not simply add more neurons, as the model will run more slowly and may overfit (the neural network recognizes training data better than it recognizes new data)
#Too few neurons means the network will not have enough parameters to learn.
You select the right values through a process called #Hyperparameter tuning
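
As a rough picture of what Flatten and a 128-neuron Dense layer compute, here is a NumPy sketch. The random image and the weight initialization are assumptions made for illustration; Keras uses its own initializer, and the activation function is left out here.

```python
import numpy as np

rng = np.random.default_rng(0)

image = rng.random((28, 28))   # stand-in for one normalized 28*28 image
flat = image.reshape(-1)       # what Flatten does: 2D grid -> 1D array

# A Dense layer with 128 neurons holds one weight per (input, neuron) pair
# plus one bias per neuron. The initialization here is illustrative only.
weights = rng.normal(0.0, 0.05, size=(784, 128))
biases = np.zeros(128)

pre_activation = flat @ weights + biases  # one value per neuron

print(flat.shape)            # (784,)
print(pre_activation.shape)  # (128,)
```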



The activation function is code that executes on each neuron in the layer. The one used here is relu (Rectified Linear Unit). It returns the input value if it is greater than 0, and 0 otherwise.
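
A minimal NumPy version of relu, written by hand for illustration (the model itself uses tf.nn.relu), looks like this:

```python
import numpy as np

def relu(x):
    """Return x where x > 0, and 0 elsewhere."""
    return np.maximum(x, 0.0)

values = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(relu(values))  # negative inputs become 0, positive inputs pass through
```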

Second Dense layer- this is the output layer and has 10 neurons since we have 10 different classes. Each neuron ends up with the probability that the input matches its class. The softmax activation function turns the 10 outputs into probabilities that sum to 1, and the predicted class is the one with the highest probability.
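
Here is a hand-written softmax sketch in NumPy. The 10 raw output values are made up for illustration; the model itself uses tf.nn.softmax.

```python
import numpy as np

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    shifted = logits - np.max(logits)  # subtract the max for numerical stability
    exps = np.exp(shifted)
    return exps / exps.sum()

# Made-up raw outputs of the 10 output neurons for one image.
logits = np.array([1.0, 0.5, -1.2, 0.0, 2.0, 0.3, -0.7, 4.0, 0.1, 1.5])
probabilities = softmax(logits)

print(probabilities.round(3))
print(int(np.argmax(probabilities)))  # 7, the index with the largest raw score
```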

Hyperparameter- a value that is used to control the training of a network (for example, the number of neurons or epochs).
Parameter- the internal values of neurons that get trained/learned.
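
To make the difference concrete, this sketch counts the parameters of the network in this section from its hyperparameters. It assumes each Dense neuron has one weight per input plus one bias, which is the usual fully connected layout.

```python
# Hyperparameters: chosen by you before training.
inputs = 28 * 28      # values per image after Flatten
hidden_neurons = 128  # size of the hidden Dense layer
classes = 10          # size of the output Dense layer

# Parameters: learned during training
# (one weight per connection plus one bias per neuron).
hidden_params = inputs * hidden_neurons + hidden_neurons
output_params = hidden_neurons * classes + classes
total_params = hidden_params + output_params

print(hidden_params, output_params, total_params)  # 100480 1290 101770
```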

In [29]:
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation=tf.nn.relu),
    tf.keras.layers.Dense(10, activation=tf.nn.softmax)
])

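Compiling the model by specifying the optimizer, the loss function, and the metrics to report.


Sparse categorical cross entropy is a categorical loss function built into TensorFlow. It is used here because the labels are integers (0-9) rather than one-hot vectors.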
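
For a single sample, the idea behind this loss can be sketched as the negative log of the probability the model gave to the true class. The function below is a hand-written stand-in with a made-up name (sparse_cce) and made-up probabilities, not the Keras implementation, which also averages the loss over the samples.

```python
import numpy as np

def sparse_cce(probabilities, label):
    """Loss for one sample: negative log of the probability of the true class."""
    return -np.log(probabilities[label])

# Made-up model output for a 3-class problem.
probabilities = np.array([0.1, 0.7, 0.2])

loss_when_right = sparse_cce(probabilities, 1)  # true class got 0.7
loss_when_wrong = sparse_cce(probabilities, 0)  # true class got only 0.1

print(loss_when_right)  # smaller loss
print(loss_when_wrong)  # larger loss
```

The label is passed as a plain integer index, which is what "sparse" refers to.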


In [30]:
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])  #reporting back accuracy

Training the network

In [31]:
model.fit(training_images, training_labels, epochs=5)  #here the code trains the model epoch by epoch
#It reports the loss and accuracy after each epoch; as training progresses, the loss generally decreases and the accuracy increases.

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.callbacks.History at 0x19d79f84fa0>

        Evaluating the model on the test data. 
#Here we get to see the accuracy of the model on data it has not seen.
#The accuracy reported might be lower than the accuracy on the training data. 
This is because the neural network only knows how to match the inputs it has been trained on with the outputs for those values.


In [32]:
model.evaluate(test_images, test_labels)




[0.3551686406135559, 0.8715000152587891]
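
The list returned by model.evaluate holds the loss first, followed by the metrics passed to compile (here only accuracy). A small sketch of reading it, using the two values copied from the output above:

```python
# Values copied from the output above.
result = [0.3551686406135559, 0.8715000152587891]

test_loss, test_accuracy = result
print(f"test loss: {test_loss:.4f}")          # test loss: 0.3552
print(f"test accuracy: {test_accuracy:.2%}")  # test accuracy: 87.15%
```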

                Exploring the Model Output
The classification gives back an array with the values of the 10 output neurons.
The values are the probabilities that the image matches the label at each index.
These results are after training for 5 epochs. Looking at the test image at index 0, the neural network reports that there's a 99.9% chance the item of clothing is labeled 9.

When you try a different index like 9, the network assigns the highest probability to label 7, which matches the true label. The exact probability varies from run to run: it was 68.9% in one run, while the output below shows about 99.9%.

When you train for longer (increase the number of epochs), the accuracy on the training set usually keeps increasing, and the accuracy on the test set often improves too, up to a point.
#Training for longer does not always result in a good model; it can cause overfitting. It's better to decide what accuracy you want your model to attain; with that target in mind, you can experiment with the values.

In [33]:
classifications = model.predict(test_images)
print(classifications[9])
print(test_labels[9])

[5.2384093e-09 5.4846145e-09 2.0803914e-09 2.4999158e-08 2.4036526e-10
 3.2031487e-04 1.6671866e-09 9.9931872e-01 1.1795587e-06 3.5976828e-04]
7
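
To read an output like this without scanning the numbers by eye, you can take the index of the largest probability. The sketch below uses the values copied from the printed output above and assumes NumPy is available.

```python
import numpy as np

# Values copied from the printed output above.
classification = np.array([
    5.2384093e-09, 5.4846145e-09, 2.0803914e-09, 2.4999158e-08, 2.4036526e-10,
    3.2031487e-04, 1.6671866e-09, 9.9931872e-01, 1.1795587e-06, 3.5976828e-04,
])
true_label = 7

predicted_label = int(np.argmax(classification))  # index of the largest probability
confidence = classification[predicted_label]

print(predicted_label, f"{confidence:.2%}")  # 7 99.93%
print(predicted_label == true_label)         # True
```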


If you want to reach a certain accuracy without knowing how many epochs it will take, you can use a #CALLBACK during training

Let's see what happens!

In [34]:
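class myCallback(tf.keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs=None):
        #logs holds the metrics for the epoch that just ended; guard against a missing 'accuracy' entry
        accuracy = (logs or {}).get('accuracy')
        if accuracy is not None and accuracy > 0.95:
            print("\nReached 95% accuracy so cancelling training!")
            self.model.stop_training = True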

In [35]:
callbacks = myCallback()
mnist = tf.keras.datasets.fashion_mnist

NOTE: here we set the number of epochs to 50, add the callbacks parameter, and pass it the callbacks object. In the class we define the on_epoch_end function, which receives the logs (metrics) for the epoch that just ended.
The callback is called during training at the end of every epoch. If the accuracy gets higher than 95%, as we have specified, the model stops training (self.model.stop_training = True).
In one run the training stopped at epoch 34, when the model attained the target of 95% accuracy. In the output below it stops after the first epoch, most likely because the model had already been trained in earlier cells and fit continues from the current weights.
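
The stopping logic itself can be tried without TensorFlow. The sketch below loops over made-up per-epoch accuracy values (they are invented for illustration, not results from this model) and applies the same check as on_epoch_end.

```python
# Made-up per-epoch training accuracies, for illustration only.
hypothetical_accuracy = [0.82, 0.88, 0.91, 0.94, 0.956, 0.96, 0.97]
target = 0.95
max_epochs = len(hypothetical_accuracy)

stopped_at = None
for epoch, accuracy in enumerate(hypothetical_accuracy, start=1):
    print(f"Epoch {epoch}/{max_epochs} - accuracy: {accuracy}")
    if accuracy > target:  # same check as in on_epoch_end
        print("Reached 95% accuracy so cancelling training!")
        stopped_at = epoch
        break  # plays the role of self.model.stop_training = True

print(stopped_at)  # 5
```

The remaining epochs are never run, which is the point of setting a generous epoch count and letting the callback decide when to stop.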



In [37]:
model.fit(training_images, training_labels, epochs=50,
          callbacks=[callbacks])

Epoch 1/50
Reached 95% accuracy so cancelling training!


<keras.callbacks.History at 0x19d798f7220>