# Loading/Preprocessing Image Data

## for image processing:
* 1- your data directory should be laid out as follows:
    - there is a train folder for the training data and a test folder for the test data
    - inside each of these, create one folder per class and put each class's images in its own folder
    - then point the function (`flow_from_directory`, used in step 4) at the top-level train folder or test folder
    - the function will then label every image according to the class folder it sits in
    - the class names are taken from the folder names, and each one is mapped to an integer label in alphanumeric order of the folder names
    - the result is label_1 --> label_n, each holding its own images (label_1: 0 -> m , label_n: 0 -> m)
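
The layout described above can be sketched with the standard library alone. This is only an illustration: the helper names `build_layout` and `folder_class_indices` and the class names are hypothetical, and the sorted-name mapping imitates the alphanumeric ordering rather than calling Keras itself.

```python
import os
import tempfile

def build_layout(root, classes, splits=("train", "test")):
    """Create root/<split>/<class>/ folders and return their paths."""
    paths = []
    for split in splits:
        for name in classes:
            path = os.path.join(root, split, name)
            os.makedirs(path, exist_ok=True)
            paths.append(path)
    return paths

def folder_class_indices(split_dir):
    """Map each class folder name to an integer label, in sorted order."""
    names = sorted(
        d for d in os.listdir(split_dir)
        if os.path.isdir(os.path.join(split_dir, d))
    )
    return {name: index for index, name in enumerate(names)}

with tempfile.TemporaryDirectory() as root:
    # hypothetical class names, deliberately given out of order
    created = build_layout(root, classes=["dogs", "cats", "birds"])
    labels = folder_class_indices(os.path.join(root, "train"))
    print(labels)
```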
----------
* 2- import the `ImageDataGenerator` class:

            from tensorflow.keras.preprocessing.image import ImageDataGenerator
--------------          
* 3- create the train_gen/vali_gen object:
    - assign the preprocessing you need inside the image generator, for example rescaling
    
             train_gen= ImageDataGenerator( rescale=1.0/255 , rotation_range=40, width_shift_range=0.3,
                                  height_shift_range=0.3, shear_range=0.2, zoom_range=0.2,
                                  fill_mode='nearest', horizontal_flip=True , validation_split=0.2 )
    
      *  `featurewise_center`::	Boolean. Set input mean to 0 over the dataset, feature-wise.
      *  `samplewise_center`::	Boolean. Set each sample mean to 0.
      *  `featurewise_std_normalization`::	Boolean. Divide inputs by std of the dataset, feature-wise.
      *  `samplewise_std_normalization`::	Boolean. Divide each input by its std.
      *  `zca_epsilon`::	epsilon for ZCA whitening. Default is 1e-6.
      *  `zca_whitening`::	Boolean. Apply ZCA whitening.
      *  `rotation_range`::	Int. Degree range for random rotations.
      *  `width_shift_range`::	Float, 1-D array-like or int
      
          *  float: fraction of total width, if < 1, or pixels if >= 1.
          *  1-D array-like: random elements from the array.
          *  int: integer number of pixels from interval (-width_shift_range, +width_shift_range)
          *  With width_shift_range=2 possible values are integers [-1, 0, +1], same as with width_shift_range=[-1, 0, +1], while with width_shift_range=1.0 possible values are floats in the interval [-1.0, +1.0).
          
      *  `height_shift_range`::	Float, 1-D array-like or int
      
          *  float: fraction of total height, if < 1, or pixels if >= 1.
          *  1-D array-like: random elements from the array.
          *  int: integer number of pixels from interval (-height_shift_range, +height_shift_range)
          *  With height_shift_range=2 possible values are integers [-1, 0, +1], same as with height_shift_range=[-1, 0, +1], while with height_shift_range=1.0 possible values are floats in the interval [-1.0, +1.0).
          
      *  `brightness_range`::	Tuple or list of two floats. Range for picking a brightness shift value from.
      *  `shear_range`::	Float. Shear Intensity (Shear angle in counter-clockwise direction in degrees)
      *  `zoom_range`::	Float or [lower, upper]. Range for random zoom. If a float, [lower, upper] = [1-zoom_range, 1+zoom_range].
      *  `channel_shift_range`::	Float. Range for random channel shifts.
      *  `fill_mode`::	One of {"constant", "nearest", "reflect" or "wrap"}. Default is 'nearest'. Points outside the boundaries of the input are filled according to the given mode:
      
          -  'constant': kkkkkkkk|abcd|kkkkkkkk (cval=k)
          -  'nearest': aaaaaaaa|abcd|dddddddd
          -  'reflect': abcddcba|abcd|dcbaabcd
          -  'wrap': abcdabcd|abcd|abcdabcd
                          
      *  `cval`::	Float or Int. Value used for points outside the boundaries when fill_mode = "constant".
      *  `horizontal_flip`::	Boolean. Randomly flip inputs horizontally.
      *  `vertical_flip`::	Boolean. Randomly flip inputs vertically.
      *  `rescale`::	rescaling factor. Defaults to None. If None or 0, no rescaling is applied, otherwise we multiply the data by the value provided (after applying all other transformations).
      *  `preprocessing_function`::	function that will be applied on each input. 
          * The function will run after the image is resized and augmented. The function should take one argument: one image (Numpy tensor with rank 3), and should output a Numpy tensor with the same shape.
      *  `data_format`::	Image data format, either "channels_first" or "channels_last".
          * "channels_last" mode means that the images should have shape (samples, height, width, channels),
          * "channels_first" mode means that the images should have shape (samples, channels, height, width). 
          * It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json. If you never set it, then it will be "channels_last".
      *  `validation_split`::	Float. Fraction of images reserved for validation (strictly between 0 and 1).
      *  `dtype`::	Dtype to use for the generated arrays.
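
The four `fill_mode` patterns listed above can be reproduced on a single row of pixels with NumPy. This is a sketch only: the mapping from Keras fill modes to `np.pad` modes is an assumption made for illustration, and the real generator fills a 2-D image after a geometric transform rather than padding a row.

```python
import numpy as np

row = np.array([1, 2, 3, 4])   # stands in for the pixels "abcd"
pad = 4                        # out-of-bounds points to fill on each side

# Keras fill_mode -> roughly equivalent np.pad mode (assumed for this sketch)
numpy_mode = {
    "constant": "constant",
    "nearest": "edge",
    "reflect": "symmetric",
    "wrap": "wrap",
}

filled = {}
for fill_mode, mode in numpy_mode.items():
    # cval=0 is only meaningful for the constant mode
    kwargs = {"constant_values": 0} if mode == "constant" else {}
    filled[fill_mode] = np.pad(row, pad, mode=mode, **kwargs).tolist()
    print(f"{fill_mode:>8}: {filled[fill_mode]}")
```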
---------------          
* 4- create the variable to load the data into:
    - assign the parameters you need inside
    - `target_size` resizes the images as they are loaded into the variable, NOT THE FILES ON DISK
    - `batch_size` is the number of images loaded and augmented per batch
    - `class_mode` depends on the type of labels: `'binary'` for binary classification, `'categorical'` for one-hot multi-class labels, `'sparse'` for integer labels
    - if `validation_split` was set on the generator, pass `subset='training'` or `subset='validation'` to get the matching part of the data
    - use `flow_from_directory` to take the path to a directory and generate batches of augmented data:

            train_set= train_gen.flow_from_directory(
                                                        directory, target_size=(256, 256), color_mode='rgb', classes=None,
                                                        class_mode='categorical', batch_size=32, shuffle=True, seed=None,
                                                        save_to_dir=None, save_prefix='', save_format='png', follow_links=False,
                                                        subset=None, interpolation='nearest' )

    
* 5- do the same steps for the test_set or dev_set if there is a folder for it          
--------------



# Processing and Building Sequence models

## for sequence processing:

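* 1- load the data as a `tf.data.Dataset` object or as a NumPy array. DO NOT FORGET TO CONVERT THE VALUES to a numeric type first (for example, strings read from a CSV file to floats)

        import tensorflow as tf

        dataset = tf.data.Dataset.range(10)  # toy series holding the values 0..9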
----------------
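* 2- for plotting the data:

        import matplotlib.pyplot as plt

        def plot_series(time, series, format="-", start=0, end=None):
            # draw series[start:end] against time[start:end] on the current figure
            plt.plot(time[start:end], series[start:end], format)
            plt.xlabel("Time")
            plt.ylabel("Value")
            plt.grid(True)

        # time_valid, x_valid and results are defined in the later steps
        plt.figure(figsize=(10, 6))
        plot_series(time_valid, x_valid)
        plot_series(time_valid, results)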
-----
### notice that the following calls are methods of the dataset object; each one returns a new dataset, so assign the result back to the variable

* 3- create the window you will use to iterate over the data:

            dataset = dataset.window(5, shift=1, drop_remainder=True)
    - where 5 is the number of elements in one window
    - shift is how far the window moves each time (think of it as a stride)
    - drop_remainder keeps every window at size 5: the shorter windows at the end of the data (4 elements, then 3, then 2, and so on) are dropped
    - each window is itself a small dataset, so flatten it into a single tensor by batching it with the same number of elements as the window, ' 5 '

            dataset = dataset.flat_map(lambda window: window.batch(5))
    
    - mapping/slicing each window inside the dataset variable into x , y , where x is input value and y is the wanted value    

            dataset = dataset.map(lambda window: (window[:-1], window[-1:]))

    - shuffling the windows:

            dataset = dataset.shuffle(buffer_size=10)
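    - the buffer_size is the number of elements held in the shuffle buffer; a buffer at least as large as the dataset gives a full shuffle, so 10 is enough for this toy data ' values from 0 -> 9 = 10 elements '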

    - batching the data as 2 inputs/outputs at a time

            dataset = dataset.batch(2).prefetch(1)

    - and now you can access the x , y values from the dataset variable as:
        
            for x , y in dataset:
                print(x.numpy(), y.numpy())
    - we use a for loop because the data comes out in batches of 2: x = [ i1 , i2 ] , y = [ o1 , o2 ]
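
Put together, the windowing steps can be checked on the toy series 0..9. This is a minimal sketch that assumes TensorFlow 2.x is installed; the helper name `make_windows` is hypothetical, and shuffling and batching are left out so the printed pairs stay in order.

```python
import tensorflow as tf

def make_windows(n=10, window=5):
    """Window a toy series 0..n-1 and split each window into (x, y)."""
    ds = tf.data.Dataset.range(n)
    ds = ds.window(window, shift=1, drop_remainder=True)
    ds = ds.flat_map(lambda w: w.batch(window))      # dataset of windows -> tensors
    ds = ds.map(lambda w: (w[:-1], w[-1:]))          # first 4 values -> x, last -> y
    return ds

pairs = [(x.numpy().tolist(), y.numpy().tolist()) for x, y in make_windows()]
for x, y in pairs:
    print(x, "->", y)
```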
---------

* you can simply use this function to return a windowed dataset of a series data:

        split_time = 1000
        x_train = series_data[:split_time] # the data (training) before the split_time
        x_valid = series_data[split_time:] # the data (test/valid) after the split_time
        time_train = time[:split_time] # the time interval for the train
        time_valid = time[split_time:] # the time interval for the test 

        def windowed_dataset(series, window_size, batch_size, shuffle_buffer):
            series = tf.expand_dims(series, axis=-1)  # (n,) -> (n, 1), the shape RNN / Conv1D layers expect
            ds = tf.data.Dataset.from_tensor_slices(series)
            ds = ds.window(window_size + 1, shift=1, drop_remainder=True)
            ds = ds.flat_map(lambda w: w.batch(window_size + 1))
            ds = ds.shuffle(shuffle_buffer)
            ds = ds.map(lambda w: (w[:-1], w[1:]))  # labels are the inputs shifted one step ahead
            return ds.batch(batch_size).prefetch(1)

        # window_size, batch_size and shuffle_buffer_size are hyperparameters you choose beforehand
        dataset = windowed_dataset(x_train, window_size, batch_size, shuffle_buffer_size)
        # NOTICE that only x_train goes into this function, the other params need no further processing

  - notice that window_size + 1 elements are taken per window so that, after the split, the input keeps window_size elements and the extra element supplies the label; the input layer will therefore be of shape window_size
  - with `w[1:]` the label is a whole sequence, which suits models that return one output per time step; to predict only the next value, use `w[-1:]` as in step 3
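
To see what the `(w[:-1], w[1:])` split produces without building a dataset, the same windows can be imitated in NumPy. This is a sketch under assumptions: `numpy_windows` is a hypothetical helper, it skips shuffling and batching, and it needs NumPy 1.20 or newer for `sliding_window_view`.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def numpy_windows(series, window_size):
    """Return (inputs, labels) where labels are the inputs shifted by one step."""
    windows = sliding_window_view(np.asarray(series), window_size + 1)
    return windows[:, :-1], windows[:, 1:]

x, y = numpy_windows(np.arange(6), window_size=3)
print(x.tolist())
print(y.tolist())
```

Each row of `y` is the matching row of `x` moved one step forward in time, which is why `window_size + 1` values are needed per window.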


        
---------
* 4- pass the data into the network of choice, make sure the input layer shape is the same as window_size
    
        tf.keras.layers.Dense(1 , input_shape=[window_size] )
    - if you are using RNNs or Conv1D then you will need to reshape the input, and you can do that easily by using a Lambda layer:
    
        `tf.keras.layers.Lambda( lambda x: tf.expand_dims(x, axis=-1), input_shape=[None])`
        - you can also expand the last dim of the series data/x_train on the first line of the windowing function (as `windowed_dataset` does) instead of using the Lambda layer
    - The last layer must be Dense(1) with no activation
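
The reshape done by the Lambda layer only adds a trailing feature axis. A small NumPy sketch shows the shape change; it uses `np.expand_dims`, which behaves like `tf.expand_dims` for this purpose, and the batch values are made up.

```python
import numpy as np

window_size = 4
# a hypothetical batch of 2 windows: shape (batch, window_size)
batch = np.arange(8, dtype=np.float32).reshape(2, window_size)

# RNN and Conv1D layers expect (batch, time steps, features)
expanded = np.expand_dims(batch, axis=-1)
print(batch.shape, "->", expanded.shape)
```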
-------

* 5- Use a learning rate scheduler to choose the right lr
        
            lr_schedule = tf.keras.callbacks.LearningRateScheduler( lambda epoch: lr * 10**(epoch / 20) ) # where lr is the starting lr set in the optimizer
        
     - pass it to fit with `callbacks=[lr_schedule]`, then plot the lr vs loss to spot the best lr :
     
            import numpy as np

            lrs = lr * (10 ** (np.arange(100) / 20)) # the lr used at each epoch, the range must match the number of epochs trained
            plt.semilogx(lrs, history.history["loss"])
            plt.axis([ lr , 1e-1 , 0 , 150 ]) # [x start, x end, y start, y end], change them to fit your lr range and loss height
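
The schedule multiplies the learning rate by 10 every 20 epochs. The sketch below evaluates it in plain Python and picks the rate at the lowest recorded loss; the starting rate `1e-8`, the loss values, and the helper names are hypothetical, and taking the exact minimum is only a rough heuristic (a slightly smaller rate is often the safer choice).

```python
import math

def lr_at_epoch(epoch, start_lr=1e-8):
    """Learning rate the scheduler would set at a given epoch."""
    return start_lr * 10 ** (epoch / 20)

def lr_at_lowest_loss(losses, start_lr=1e-8):
    """Return the learning rate used at the epoch with the lowest loss."""
    best_epoch = min(range(len(losses)), key=losses.__getitem__)
    return lr_at_epoch(best_epoch, start_lr)

losses = [5.0, 4.0, 2.5, 2.0, 3.5, 9.0]   # hypothetical per-epoch losses
print(lr_at_epoch(0), lr_at_epoch(20), lr_at_epoch(100))
print(lr_at_lowest_loss(losses))
```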
-----

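* 6- compile and fit the network as usual, make sure the loss is a regression loss such as 'mse' or 'mae', or `tf.keras.losses.Huber()` for data with many outliers (RMSE is available as the metric `tf.keras.metrics.RootMeanSquaredError()`, not as a built-in loss name)
    - if you are training Conv or RNN layers, scaling the output up (for example by a factor of 100) can help when the series values are much larger than the roughly [-1, 1] range of the default RNN activations
        `tf.keras.layers.Lambda( lambda x: x*100.0 )`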
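
Why Huber copes better with outliers can be seen by computing the three losses by hand. This is a plain-Python sketch of the standard Huber formula with `delta=1.0`; the error values are made up, with one deliberate outlier.

```python
def huber(error, delta=1.0):
    """Quadratic for small errors, linear for large ones."""
    a = abs(error)
    if a <= delta:
        return 0.5 * error ** 2
    return delta * (a - 0.5 * delta)

errors = [0.5, -0.5, 10.0]   # hypothetical prediction errors, last one is an outlier

mse = sum(e ** 2 for e in errors) / len(errors)
mae = sum(abs(e) for e in errors) / len(errors)
huber_mean = sum(huber(e) for e in errors) / len(errors)
print(mse, mae, huber_mean)
```

The outlier dominates the squared error, while the Huber value stays close to the absolute error.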
        
-------------

* 7- forecast the predictions:

        forecast=[]
        # iterate over the whole series one window at a time (the loop variable is named t so it does not overwrite the time array)
        for t in range( len( series_data ) - window_size ):
            # predict from the slice series_data[t : t+window_size], adding a batch axis for the neural network
            # notice that the slices are on the whole series_data not just the test one
            forecast.append( model.predict( series_data[ t : t+window_size ] [np.newaxis] ) )
        # keep only the predictions for the unseen test data, from split_time-window_size until the end of the data
        forecast = forecast[ split_time - window_size : ]
        results = np.array(forecast)[:, 0, 0] # putting the forecast data in shape for plotting
        
        #plot the results
        plt.figure(figsize=(10, 6))
        plot_series(time_valid, x_valid)
        plot_series(time_valid, results)
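
The index arithmetic in the forecast loop is easy to get wrong, so here is a small check that the sliced forecast lines up with the validation data. It is a sketch only: `naive_predict` is a hypothetical stand-in for `model.predict` that repeats the last value of the window, and the series is a made-up ramp.

```python
import numpy as np

series_data = np.arange(30, dtype=np.float32)   # hypothetical toy series
window_size = 5
split_time = 20

def naive_predict(window):
    """Stand-in for model.predict: forecast the last value of the window."""
    return window[-1]

# forecast[t] is the prediction for index t + window_size
forecast = [naive_predict(series_data[t:t + window_size])
            for t in range(len(series_data) - window_size)]

# keep the predictions for indices split_time .. end
results = np.array(forecast[split_time - window_size:])
x_valid = series_data[split_time:]
mae = float(np.mean(np.abs(results - x_valid)))
print(len(results), len(x_valid), mae)
```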
-------------------

### To skip the first row of a csv file 'in case the first row has titles', pass the reader to `next( )`; you can then iterate through the reader without the header row getting in the way:

        import csv

        with open('Sunspots.csv') as csvfile :
            ds=csv.reader(csvfile, delimiter=',')
            next(ds) # consume the header row
            for row in ds: # starts at the second row
                print(row)
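
The same pattern can be tried without a file on disk. In this sketch the CSV text, its column names, and its values are all hypothetical; the column layout of a real file may differ, so adjust the indices accordingly.

```python
import csv
import io

# hypothetical file contents with a header row
sample = "index,date,value\n0,2000-01-31,10.5\n1,2000-02-29,20.25\n"

time_steps, values = [], []
with io.StringIO(sample) as csvfile:
    reader = csv.reader(csvfile, delimiter=",")
    header = next(reader)               # skip (and keep) the title row
    for row in reader:                  # iteration starts at the second row
        time_steps.append(int(row[0]))
        values.append(float(row[2]))    # convert strings to numbers

print(header)
print(time_steps, values)
```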
            
-------------------



# Building Models in `tf.keras.models`

* 1: first make sure you are using `tf.keras` and not the standalone `keras` package, because the two differ
--------------------------
### make sure to clear out variables before rerunning the model again using `tf.keras.backend.clear_session()` 
------------

* 2: assign Sequential to a variable 'model'
--------------------------
* 3: you can add layers by using `model.add( )` which adds 1 layer at a time after initializing the model

    - but we will build it for easier reading in a list inside Sequential
--------------------------
* 4: inside Sequential we add the layers we want in a list:

    - take good care that every layer comes from the same library: they should all be from tf.keras, not keras. You can use `from tensorflow import keras` as a shorthand, but keep in mind that tf.keras and standalone keras are not the same in how they call or initialize things.
    
            model = tf.keras.models.Sequential([
                                                tf.keras.layers.Flatten(),
                                                tf.keras.layers.Dense(512, tf.nn.relu),
                                                tf.keras.layers.Dense(128 , tf.nn.tanh),
                                                tf.keras.layers.Dense(10 , tf.nn.softmax)
                                                ])
                                        

--------------------------
* 5: after initializing the model layers we compile the model itself:

    - the optimizer, loss and metrics can all be defined outside of the compile function to tweak their params, and then passed in as variables

            model.compile(
                            optimizer='rmsprop', loss=None, metrics=None, loss_weights=None,
                            sample_weight_mode=None, weighted_metrics=None, **kwargs )

--------------------------
* 6: fitting the model by passing in different params:

    - the training set, training labels, epochs, batch_size, etc..
    
            history = model.fit(
                            x=None, y=None, batch_size=None, epochs=1, verbose=1, callbacks=None,
                            validation_split=0.0, validation_data=None, shuffle=True, class_weight=None,
                            sample_weight=None, initial_epoch=0, steps_per_epoch=None,
                            validation_steps=None, validation_batch_size=None, validation_freq=1,
                            max_queue_size=10, workers=1, use_multiprocessing=False )
--------------------------                           
* 7: evaluating the model:
    - using evaluate returns the loss value & metrics values for the model in test mode  
    
            model.evaluate(
                     x=None, y=None, batch_size=None, verbose=1, sample_weight=None, steps=None,
                     callbacks=None, max_queue_size=10, workers=1, use_multiprocessing=False,
                     return_dict=False )
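
Steps 4 to 7 can be run end to end on random data. This is a minimal sketch assuming TensorFlow 2.x is installed; the random "images", the labels, and the small layer sizes are hypothetical and only meant to show the calls fitting together, not to train anything useful.

```python
import numpy as np
import tensorflow as tf

tf.keras.backend.clear_session()

rng = np.random.default_rng(0)
x = rng.random((32, 28, 28)).astype("float32")   # hypothetical random "images"
y = rng.integers(0, 10, size=32)                 # hypothetical integer labels 0..9

model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])

# the optimizer is defined outside compile so its params can be tweaked
optimizer = tf.keras.optimizers.RMSprop(learning_rate=0.001)
model.compile(optimizer=optimizer,
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

history = model.fit(x, y, batch_size=8, epochs=2,
                    validation_split=0.25, verbose=0)
scores = model.evaluate(x, y, verbose=0, return_dict=True)
probs = model.predict(x[:4], verbose=0)

print(sorted(history.history))
print(scores)
print(probs.shape)
```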
--------------------------
* 8: to retrieve a single layer (for example to inspect its output or weights):
    - Retrieves a layer based on either its name (unique) or index.
    
            model.get_layer( name=None, index=None )

--------------------------       
* 9: to make predictions:
    - Generates output predictions for the input samples.
    
            model.predict(
                x, batch_size=None, verbose=0, steps=None, callbacks=None, max_queue_size=10,
                workers=1, use_multiprocessing=False )
--------------------------
* 10: to predict a batch of input:

           model.predict_on_batch( x )
--------------------------
* 11: to print a summary of the model architecture:

            model.summary(
                line_length=None, positions=None, print_fn=None )

--------------------------
* 12: to save the model:
    - Saves the model to Tensorflow SavedModel or a single HDF5 file.

           model.save(
                filepath, overwrite=True, include_optimizer=True, save_format=None,
                signatures=None, options=None )
    - The savefile includes:
        - The model architecture, allowing to re-instantiate the model.
        - The model weights.
        - The state of the optimizer, allowing to resume training exactly where you left off.
--------------------------
* 13: to load the whole model:
    - Saved models can be reinstantiated via `load_model`. The model returned by `load_model` is a compiled model ready to be used (unless the saved model was never compiled in the first place).

    - Models built with the Sequential and Functional API can be saved to both the HDF5 and SavedModel formats. Subclassed models can only be saved with the SavedModel format.
    
            model_var = tf.keras.models.load_model( filepath )
--------------------------
* 14: to Save weights of the model layers:
    - Saves all layer weights.
      Either saves in HDF5 or in TensorFlow format based on the save_format argument.

            model.save_weights( filepath, overwrite=True, save_format=None )
    - When saving in HDF5 format, the weight file has:
        - layer_names (attribute), a list of strings (ordered names of model layers).
        - For every layer, a group named layer.name
        - For every such layer group, a group attribute weight_names, a list of strings (ordered names of weights tensor of the layer).
        - For every weight in the layer, a dataset storing the weight value, named after the weight tensor.
--------------------------            
* 15: to Load weights into the model layers:
    - Loads all layer weights, either from a TensorFlow or an HDF5 weight file.
    
            model.load_weights( filepath, by_name=False, skip_mismatch=False )
            
        - If by_name is False weights are loaded based on the network's topology. This means the architecture should be the same as when the weights were saved. Note that layers that don't have weights are not taken into account in the topological ordering, so adding or removing layers is fine as long as they don't have weights.

        - If by_name is True, weights are loaded into layers only if they share the same name. This is useful for fine-tuning or transfer-learning models where some of the layers have changed.
--------------------------
* 16: to save and load in json format:
    - Returns a JSON string containing the network configuration.
    
            model.to_json( **kwargs )

    - To load a network from a JSON save file, use 
    
            model_var= keras.models.model_from_json(json_string, custom_objects={})
--------------------------            
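* 17: to save and load in yaml format:
    - NOTE: from TensorFlow 2.6 on, both `to_yaml` and `model_from_yaml` raise a RuntimeError (they were disabled for security reasons), so prefer the JSON format on recent versions
    - Returns a yaml string containing the network configuration
    
            model.to_yaml( **kwargs )

    - To load a network from a yaml save file, use 
    
            keras.models.model_from_yaml(yaml_string, custom_objects={})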

        - custom_objects should be a dictionary mapping the names of custom losses / layers / etc to the corresponding functions / classes.
--------------------------     
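* 18: you can use the variable you assigned the result of `model.fit` to, ' history ', to retrieve more info about the training run:
    - like using `history.history['acc']` , `history.history['val_acc']`, `history.history['loss']` , etc. (the accuracy keys follow the metric name passed to compile, so they are `'accuracy'` / `'val_accuracy'` if you passed `metrics=['accuracy']`)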
--------------------


# Layers in `tf.keras.layers`

* `Conv2D()`
    - don't forget to reshape the images from 3D (m,h,w) to 4D (m,h,w,c), c for channel, by using np.expand_dims
    - and don't forget to set `input_shape=(h,w,c)` as a parameter of the first Conv2D layer
    - the parameter `padding='_'`: if the value is 'same' then (with stride 1) the output from the conv will be of the same size as the input
    - if `padding='valid'` the image will shrink by the formula floor( (x-f)/s )+1 where x is the size of image, f for kernel , s for stride
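
The size formula can be wrapped in a small helper to sanity-check layer shapes before building a model. This is a sketch: `conv_output_size` is a hypothetical name, it handles one spatial dimension at a time, and the 'same' branch uses the usual ceil(x / s) rule.

```python
import math

def conv_output_size(x, f, s=1, padding="valid"):
    """Output length along one spatial dimension of a conv or pooling layer."""
    if padding == "valid":
        return (x - f) // s + 1          # floor((x - f) / s) + 1
    if padding == "same":
        return math.ceil(x / s)          # same size as the input when s == 1
    raise ValueError(f"unknown padding: {padding!r}")

after_conv = conv_output_size(28, f=3)                  # 3x3 kernel on a 28x28 image
after_pool = conv_output_size(after_conv, f=2, s=2)     # 2x2 pooling with stride 2
print(after_conv, after_pool)
print(conv_output_size(28, f=3, padding="same"))
```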
---------------------------------------------------

* `MaxPool2D()` 

    - don't forget that it shrinks the size depending on the pool size and stride, same as the formula for Conv2D
---------------------------------------------------
* `Flatten()` 

    - Flattens the input. Does not affect the batch size.
-----------------------------------------------------
* `Dense()`

    - acts as a fully connected layer of neurons
---------------------------------------------
* `Dropout()`
    - The Dropout layer randomly sets input units to 0 with a frequency of rate at each step during training time, which helps prevent overfitting
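
What dropout does during training can be imitated in NumPy. This is a sketch of the usual "inverted dropout" scheme, in which the kept units are scaled by 1 / (1 - rate) so the expected sum is unchanged; the helper name `dropout` and the inputs are hypothetical, and this is not the Keras implementation itself.

```python
import numpy as np

def dropout(x, rate, rng):
    """Zero each unit with probability `rate`, scale the rest by 1 / (1 - rate)."""
    keep = rng.random(x.shape) >= rate
    return np.where(keep, x / (1.0 - rate), 0.0)

rng = np.random.default_rng(0)
x = np.ones(10_000)
out = dropout(x, rate=0.5, rng=rng)

print(np.unique(out))   # only zeros and scaled-up values remain
print(out.mean())       # stays close to the input mean of 1.0
```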
--------------------
* `BatchNormalization()`
    - Normalize the activations of the previous layer at each batch, i.e. applies a transformation that maintains the mean activation close to 0 and the activation standard deviation close to 1.
----------------------
* `Embedding()`
    - Turns positive integers (indexes) into dense vectors of fixed size.
-------------------
* `RNN()`
    - Base class for recurrent layers.
-------------
* `LSTM()`
    - Long Short-Term Memory layer - Hochreiter 1997.
--------------
* `GRU()`
    - Gated Recurrent Unit - Cho et al. 2014.
-------------
* `Bidirectional( )`
    - Bidirectional wrapper for RNNs, you put inside it the layers you want to be duplicated in reverse direction
------------------
* `Lambda()`   

    - used to wrap an arbitrary function as a layer, often as the first layer of the model to change the shape of the input etc., where the variable in the function refers to the input to that layer
    - takes the function as a lambda and the input_shape as desired
    
            Lambda( lambda x: tf.expand_dims(x, axis=-1), input_shape=[None])
       - notice that the None value lets that dimension take any length
-----------


# Optimizers in `tf.keras.optimizers`

* `Adam()`: Optimizer that implements the Adam algorithm.

        tf.keras.optimizers.Adam( learning_rate=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-07, amsgrad=False )
---------------
* `RMSprop()`: Optimizer that implements the RMSprop algorithm.

        tf.keras.optimizers.RMSprop( learning_rate=0.001, rho=0.9, momentum=0.0, epsilon=1e-07, centered=False )
---------------
* `SGD()` : Stochastic gradient descent and momentum optimizer.

        tf.keras.optimizers.SGD( learning_rate=0.01, momentum=0.0, nesterov=False )
----------------
* `Adamax()`: Optimizer that implements the Adamax algorithm.

        tf.keras.optimizers.Adamax( learning_rate=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-07 )
-----------------
* `Nadam()`: Optimizer that implements the NAdam algorithm.

        tf.keras.optimizers.Nadam( learning_rate=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-07 )
-------------------        
* `Adagrad()`: Optimizer that implements the Adagrad algorithm.

        tf.keras.optimizers.Adagrad( learning_rate=0.001, initial_accumulator_value=0.1, epsilon=1e-07 )
--------------------        
* `Adadelta()`: Optimizer that implements the Adadelta algorithm.

        tf.keras.optimizers.Adadelta( learning_rate=0.001, rho=0.95, epsilon=1e-07 )
---------------------
* `Ftrl()`: Optimizer that implements the FTRL algorithm.

        tf.keras.optimizers.Ftrl(
            learning_rate=0.001, learning_rate_power=-0.5, initial_accumulator_value=0.1,
            l1_regularization_strength=0.0, l2_regularization_strength=0.0, name='Ftrl',
            l2_shrinkage_regularization_strength=0.0, **kwargs )

# Metrics in `tf.keras.metrics`

### You can add metrics in a list and pass it to the `metrics` argument in compile
---------------
* `Accuracy()` : Calculates how often predictions equal labels.
-----------------------
* `BinaryAccuracy()` : Calculates how often predictions match binary labels.
-----------------------
* `BinaryCrossentropy()`: Computes the crossentropy metric between the labels and predictions.
-----------------------
* `CategoricalAccuracy()`: Calculates how often predictions match one-hot labels.
-----------------------
* `CategoricalCrossentropy()` : Computes the crossentropy metric between the labels and predictions.
-----------------------
* `CosineSimilarity()` : Computes the cosine similarity between the labels and predictions.
-----------------------
* `FalseNegatives()` : Calculates the number of false negatives.
-----------------------
* `FalsePositives()` : Calculates the number of false positives.
-----------------------
* `TrueNegatives()` : Calculates the number of true negatives.
-----------------------
* `TruePositives()` : Calculates the number of true positives.
-----------------------
* `KLDivergence()` : Computes Kullback-Leibler divergence metric between y_true and y_pred.
-----------------------
* `LogCoshError()` : Computes the logarithm of the hyperbolic cosine of the prediction error.
-----------------------
* `MeanAbsoluteError()` : Computes the mean absolute error between the labels and predictions.
-----------------------
* `MeanAbsolutePercentageError()` : Computes the mean absolute percentage error between y_true and y_pred.
-----------------------
* `MeanIoU()` : Computes the mean Intersection-Over-Union metric.
-----------------------
* `MeanRelativeError()` : Computes the mean relative error by normalizing with the given values.
-----------------------
* `MeanSquaredError()` : Computes the mean squared error between y_true and y_pred.
-----------------------
* `MeanSquaredLogarithmicError()` : Computes the mean squared logarithmic error between y_true and y_pred.
-----------------------
* `MeanTensor()` : Computes the element-wise (weighted) mean of the given tensors.
-----------------------
* `Metric()` : Encapsulates metric logic and state.
-----------------------
* `Poisson()` : Computes the Poisson metric between y_true and y_pred.
-----------------------
* `Precision()` : Computes the precision of the predictions with respect to the labels.
-----------------------
* `PrecisionAtRecall()` : Computes the precision at a given recall.
-----------------------
* `Recall()` : Computes the recall of the predictions with respect to the labels.
-----------------------
* `RecallAtPrecision()` : Computes the maximally achievable recall at a required precision.
-----------------------
* `RootMeanSquaredError()` : Computes root mean squared error metric between y_true and y_pred.
-----------------------
* `SensitivityAtSpecificity()` : Computes the sensitivity at a given specificity.
-----------------------
* `SparseCategoricalAccuracy()` : Calculates how often predictions match integer labels.
-----------------------
* `SparseCategoricalCrossentropy()` : Computes the crossentropy metric between the labels and predictions.
-----------------------
* `SparseTopKCategoricalAccuracy()` : Computes how often integer targets are in the top K predictions.
-----------------------
* `SpecificityAtSensitivity()` : Computes the specificity at a given sensitivity.
-----------------------
* `SquaredHinge()` : Computes the squared hinge metric between y_true and y_pred.
-----------------------
* `TopKCategoricalAccuracy()` : Computes how often targets are in the top K predictions.
-----------------------
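
Several of the metrics above are simple ratios of the four confusion counts. The plain-Python sketch below computes them by hand for binary labels; the helper name `confusion_counts` and the label lists are made up for illustration.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, FP, TN, FN) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn

y_true = [1, 1, 1, 0, 0, 0, 1, 0]   # hypothetical labels
y_pred = [1, 0, 1, 0, 1, 1, 1, 0]   # hypothetical predictions

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
accuracy = (tp + tn) / len(y_true)
print(tp, fp, tn, fn)
print(precision, recall, accuracy)
```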


# Losses in `tf.keras.losses`

* `BinaryCrossentropy()` : Computes the cross-entropy loss between true labels and predicted labels.
--------------------
* `CategoricalCrossentropy()` : Computes the crossentropy loss between the labels and predictions.
--------------------
* `CategoricalHinge()` : Computes the categorical hinge loss between y_true and y_pred.
--------------------
* `CosineSimilarity()` : Computes the cosine similarity between y_true and y_pred.
--------------------
* `Hinge()` : Computes the hinge loss between y_true and y_pred.
--------------------
* `Huber()` : Computes the Huber loss between y_true and y_pred. Works well with OUTLIERS
--------------------
* `KLDivergence()` : Computes Kullback-Leibler divergence loss between y_true and y_pred.
--------------------
* `LogCosh()` : Computes the logarithm of the hyperbolic cosine of the prediction error.
--------------------
* `Loss()` : Loss base class.
--------------------
* `MeanAbsoluteError()` : Computes the mean of absolute difference between labels and predictions.
--------------------
* `MeanAbsolutePercentageError()` : Computes the mean absolute percentage error between y_true and y_pred.
--------------------
* `MeanSquaredError()` : Computes the mean of squares of errors between labels and predictions.
--------------------
* `MeanSquaredLogarithmicError()` : Computes the mean squared logarithmic error between y_true and y_pred.
--------------------
* `Poisson()` : Computes the Poisson loss between y_true and y_pred.
--------------------
* `Reduction()` : Types of loss reduction.
--------------------
* `SparseCategoricalCrossentropy()` : Computes the crossentropy loss between the labels and predictions.
--------------------
* `SquaredHinge()` : Computes the squared hinge loss between y_true and y_pred.
--------------------
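
The categorical and sparse categorical crossentropy losses differ only in how the labels are encoded. A NumPy sketch with made-up probabilities shows that one-hot labels and integer labels give the same value.

```python
import numpy as np

# hypothetical softmax outputs for 2 samples and 3 classes
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1]])
int_labels = np.array([0, 1])          # what the sparse version expects
one_hot = np.eye(3)[int_labels]        # what the categorical version expects

categorical = -np.sum(one_hot * np.log(probs), axis=1).mean()
sparse = -np.log(probs[np.arange(len(int_labels)), int_labels]).mean()
print(categorical, sparse)
```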

# Initializers in `tf.keras.initializers`

* `Constant()` : Initializer that generates tensors with constant values.
---------------
* `GlorotNormal()` : The Glorot normal initializer, also called Xavier normal initializer.
---------------
* `GlorotUniform()` : The Glorot uniform initializer, also called Xavier uniform initializer.
---------------
* `Identity()` : Initializer that generates the identity matrix.
---------------
* `Initializer()` : Initializer base class: all initializers inherit from this class.
---------------
* `Ones()` : Initializer that generates tensors initialized to 1.
---------------
* `Orthogonal()` : Initializer that generates an orthogonal matrix.
---------------
* `RandomNormal()` : Initializer that generates tensors with a normal distribution.
---------------
* `RandomUniform()` : Initializer that generates tensors with a uniform distribution.
---------------
* `TruncatedNormal()` : Initializer that generates a truncated normal distribution.
---------------
* `VarianceScaling()` : Initializer capable of adapting its scale to the shape of weights tensors.
---------------
* `Zeros()` : Initializer that generates tensors initialized to 0.
---------------
* `constant()` : Initializer that generates tensors with constant values.
---------------
* `glorot_normal()` : The Glorot normal initializer, also called Xavier normal initializer.
---------------
* `glorot_uniform()` : The Glorot uniform initializer, also called Xavier uniform initializer.
---------------
* `identity()` : Initializer that generates the identity matrix.
---------------
* `ones()` : Initializer that generates tensors initialized to 1.
---------------
* `orthogonal()` : Initializer that generates an orthogonal matrix.
---------------
* `zeros()` : Initializer that generates tensors initialized to 0.
---------------
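
As an illustration of what an initializer computes, the Glorot (Xavier) uniform rule draws weights from [-limit, limit] with limit = sqrt(6 / (fan_in + fan_out)). The NumPy sketch below follows that formula; the helper name and layer sizes are hypothetical, and it is not the Keras implementation.

```python
import numpy as np

def glorot_uniform(fan_in, fan_out, rng):
    """Sample a (fan_in, fan_out) weight matrix with the Glorot uniform rule."""
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))

fan_in, fan_out = 300, 100
limit = np.sqrt(6.0 / (fan_in + fan_out))
w = glorot_uniform(fan_in, fan_out, np.random.default_rng(0))

print(w.shape, float(np.abs(w).max()), limit)
print(float(w.std()), limit / np.sqrt(3.0))   # a uniform on [-a, a] has std a / sqrt(3)
```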

# Callbacks in `tf.keras.callbacks`

* `BaseLogger()` : Callback that accumulates epoch averages of metrics.

        tf.keras.callbacks.BaseLogger( stateful_metrics=None )
---------------------
* `CSVLogger()` : Callback that streams epoch results to a csv file.

        tf.keras.callbacks.CSVLogger( filename, separator=',', append=False )
---------------------
* `Callback()` : Abstract base class used to build new callbacks.

        tf.keras.callbacks.Callback()

    - The logs dictionary that callback methods take as argument will contain keys for quantities relevant to the current batch or epoch.

    - Currently, the .fit() method of the Model class will include the following quantities in the logs that it passes to its callbacks:

        - on_epoch_end: logs include `acc` and `loss`, and
            optionally include `val_loss`
            (if validation is enabled in `fit`), and `val_acc`
            (if validation and accuracy monitoring are enabled).
        - on_batch_begin: logs include `size`,
            the number of samples in the current batch.
        - on_batch_end: logs include `loss`, and optionally `acc`
            (if accuracy monitoring is enabled).
    
---------------------
* `EarlyStopping()` : Stop training when a monitored metric has stopped improving.

        tf.keras.callbacks.EarlyStopping( monitor='val_loss', min_delta=0, patience=0, 
                                          verbose=0, mode='auto', baseline=None, restore_best_weights=False )

---------------------
* `LambdaCallback()` : Callback for creating simple, custom callbacks on-the-fly.
        
        tf.keras.callbacks.LambdaCallback( on_epoch_begin=None, on_epoch_end=None, 
                                            on_batch_begin=None, on_batch_end=None, 
                                            on_train_begin=None, on_train_end=None, **kwargs )
---------------------
* `LearningRateScheduler()` : Learning rate scheduler.
        
        tf.keras.callbacks.LearningRateScheduler( schedule, verbose=0 )
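
The `schedule` argument is a function that receives the epoch index (and, in TF 2.x, the current learning rate) and returns the learning rate to use for that epoch. A minimal sketch, assuming a step decay that halves the rate every 10 epochs; the name `step_decay` and the decay numbers are illustrative choices, not something prescribed by Keras:

```python
def step_decay(epoch, lr):
    """Hypothetical schedule: halve the learning rate every 10 epochs."""
    if epoch > 0 and epoch % 10 == 0:
        return lr * 0.5
    return lr


# Simulate what the callback would do over 25 epochs, starting from 0.01.
lr = 0.01
lr_per_epoch = []
for epoch in range(25):
    lr = step_decay(epoch, lr)
    lr_per_epoch.append(lr)

print(lr_per_epoch[0], lr_per_epoch[10], lr_per_epoch[20])
```

To use it in training, wrap the function as `tf.keras.callbacks.LearningRateScheduler(step_decay, verbose=1)` and pass that object in the `callbacks` list of `model.fit`.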


---------------------
* `ModelCheckpoint()` : Callback to save the Keras model or model weights at some frequency.

        tf.keras.callbacks.ModelCheckpoint(
                                            filepath, monitor='val_loss', verbose=0, save_best_only=False,
                                            save_weights_only=False, mode='auto', save_freq='epoch', **kwargs )
---------------------
* `ProgbarLogger()` : Callback that prints metrics to stdout.
            
        tf.keras.callbacks.ProgbarLogger( count_mode='samples', stateful_metrics=None )
---------------------
* `ReduceLROnPlateau()` : Reduce learning rate when a metric has stopped improving.

        tf.keras.callbacks.ReduceLROnPlateau(   monitor='val_loss', factor=0.1, patience=10, verbose=0, mode='auto',
                                                min_delta=0.0001, cooldown=0, min_lr=0, **kwargs )
---------------------
* `RemoteMonitor()` : Callback used to stream events to a server.

        tf.keras.callbacks.RemoteMonitor(   root='http://localhost:9000', path='/publish/epoch/end/', 
                                            field='data', headers=None, send_as_json=False )
---------------------
* `TensorBoard()` : Enable visualizations for TensorBoard.

        tf.keras.callbacks.TensorBoard( log_dir='logs', histogram_freq=0, write_graph=True, write_images=False,
                                        update_freq='epoch', profile_batch=2, embeddings_freq=0,
                                        embeddings_metadata=None, **kwargs )
---------------------
* `TerminateOnNaN()` : Callback that terminates training when a NaN loss is encountered.
        
        tf.keras.callbacks.TerminateOnNaN()
---------------------

# Activations in `tf.keras.activations`

* `deserialize(...)` : Returns activation function denoted by input string.
----------------------

* `elu(...)` : Exponential linear unit.
----------------------

* `exponential(...)` : Exponential activation function.
----------------------

* `get(...)` : Returns function.
----------------------

* `hard_sigmoid(...)` : Hard sigmoid activation function.
----------------------

* `linear(...)` : Linear activation function.
----------------------

* `relu(...)` : Applies the rectified linear unit activation function.
----------------------

* `selu(...)` : Scaled Exponential Linear Unit (SELU).
----------------------

* `serialize(...)` : Returns name attribute (__name__) of function.
----------------------

* `sigmoid(...)` : Sigmoid activation function.
----------------------

* `softmax(...)` : Softmax converts a real vector to a vector of categorical probabilities.
----------------------

* `softplus(...)` : Softplus activation function.
----------------------

* `softsign(...)` : Softsign activation function.
----------------------

* `swish(...)` : Swish activation function.
----------------------

* `tanh(...)` : Hyperbolic tangent activation function.
----------------------
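To make a few of these concrete, here is a small NumPy sketch of the math behind `relu`, `sigmoid` and `softmax`. These are illustrative re-implementations written for this note, not the Keras source; in a model you would simply pass `activation='relu'` (or the function from `tf.keras.activations`) to a layer.

```python
import numpy as np


def relu(x):
    # max(x, 0) element-wise
    return np.maximum(x, 0.0)


def sigmoid(x):
    # squashes every value into the range (0, 1)
    return 1.0 / (1.0 + np.exp(-x))


def softmax(x, axis=-1):
    # subtract the max first for numerical stability, then normalize
    shifted = x - np.max(x, axis=axis, keepdims=True)
    exps = np.exp(shifted)
    return exps / np.sum(exps, axis=axis, keepdims=True)


logits = np.array([[-1.0, 0.0, 2.0]])
print(relu(logits))
print(sigmoid(logits))
print(softmax(logits))  # each row sums to 1
```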

# Transfer Learning


* 1- import the model you want:

        from tensorflow.keras import Model

        from tensorflow.keras.applications.inception_v3 import InceptionV3

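        # path to weights you have already downloaded (not needed if you pass weights='imagenet' instead)
        local_weights_file = '/tmp/inception_v3_weights_tf_dim_ordering_tf_kernels_notop.h5'

        pre_trained_model = InceptionV3(input_shape=(150, 150, 3), include_top=False, pooling=None, weights=None)

        pre_trained_model.load_weights(local_weights_file)  # load the downloaded weights into the model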
-----------       
* 2- decide what to do with the weights:
    - `weights`: one of `None` (random initialization), `'imagenet'` (pre-training on ImageNet), or the path to the weights file to be loaded.
    - if you choose to train the model from scratch, you can also use
        `classes`: optional number of classes to classify images into; only to be specified if `include_top` is True and no `weights` argument is specified.
--------------
* 3- if you chose to add your own layers:

        # freeze the layers you do not want to retrain
        for layer in pre_trained_model.layers:
            layer.trainable = False

        # print the summary to see the layers, their dimensions and their names
        pre_trained_model.summary()

        last_layer = pre_trained_model.get_layer('mixed7')           # pick the layer to cut at by name
        # last_layer = pre_trained_model.get_layer(index=-1)         # or pick it by index (-1 refers to the last layer)
        print('last layer output shape: ', last_layer.output.shape)  # print the shape (not essential)
        last_output = last_layer.output
    - you can choose any layer from the summary as `last_output`. Cutting the model earlier reduces the number of params and discards the unwanted layers/features after `last_output`; your new layers attach at that point.
 
-----------------
* 4- Build your own new layers with the Keras functional API, not `tf.keras.models.Sequential`:

        from tensorflow.keras import layers

        # Flatten the output layer to 1 dimension
        x = layers.Flatten()(last_output)  # the output of the pretrained model is the input to our new layers
        x = layers.Dense(1024, activation='relu')(x)
        x = layers.Dropout(0.2)(x)
        x = layers.Dense(1, activation='sigmoid')(x)

        # specify the input of new_model (the pretrained model's input) and its output (the x that we created)
        new_model = Model(pre_trained_model.input, x)
----------
* 5- then compile and fit as usual.
-----------
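Put together, the five steps above can be sketched as one script. This is a minimal sketch under stated assumptions: `weights=None` is used only so the example runs offline (in practice you would load pretrained weights), the task is assumed to be binary classification, and the RMSprop optimizer with its learning rate is an illustrative choice.

```python
import tensorflow as tf
from tensorflow.keras import Model, layers
from tensorflow.keras.applications.inception_v3 import InceptionV3

# weights=None keeps the sketch offline (random initialization); in practice
# pass weights='imagenet' or load a local weights file.
pre_trained_model = InceptionV3(input_shape=(150, 150, 3),
                                include_top=False, weights=None)

# Freeze every pretrained layer so only the new head is trained.
for layer in pre_trained_model.layers:
    layer.trainable = False

# Cut the network at 'mixed7' and attach a new binary-classification head.
last_output = pre_trained_model.get_layer('mixed7').output
x = layers.Flatten()(last_output)
x = layers.Dense(1024, activation='relu')(x)
x = layers.Dropout(0.2)(x)
x = layers.Dense(1, activation='sigmoid')(x)
new_model = Model(pre_trained_model.input, x)

# Optimizer and learning rate are assumptions made for this sketch.
new_model.compile(optimizer=tf.keras.optimizers.RMSprop(learning_rate=1e-4),
                  loss='binary_crossentropy',
                  metrics=['accuracy'])

print(new_model.output_shape)
print(len(new_model.trainable_weights))  # only the two new Dense layers train
```

Training then proceeds with `new_model.fit(...)` on the image generators, as for any other Keras model.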

# Helpful functions in `tf.nn`

### Normalization:
- 1 `batch_norm_with_global_normalization(...)`: Batch normalization.
- 2 `batch_normalization(...)`: Batch normalization.
------------
### Loss:
- 1 `l2_loss(...)`: L2 Loss.
- 2 `l2_normalize(...)`: Normalizes along dimension axis using an L2 norm.
- 3 `softmax_cross_entropy_with_logits(...)`: Computes softmax cross entropy between logits and labels.
- 4 `sparse_softmax_cross_entropy_with_logits(...)`: Computes sparse softmax cross entropy between logits and labels.
- 5 `weighted_cross_entropy_with_logits(...)`: Computes a weighted cross entropy.
-----------------
### Activations:
- 1 `leaky_relu(...)`: Compute the Leaky ReLU activation function.
- 2 `log_softmax(...)`: Computes log softmax activations.
- 3 `relu(...)`: Computes rectified linear: max(features, 0).
- 4 `sigmoid(...)`: Computes sigmoid of x element-wise.
- 5 `softmax(...)`: Computes softmax activations.
- 6 `tanh(...)`: Computes hyperbolic tangent of x element-wise.
---------------
### Sequences stuff:
- 1 `embedding_lookup(...)`: Looks up ids in a list of embedding tensors.
- 2 `embedding_lookup_sparse(...)`: Computes embeddings for the given ids and weights.
- 3 `ctc_beam_search_decoder(...)`: Performs beam search decoding on the logits given in input.
- 4 `ctc_greedy_decoder(...)`: Performs greedy decoding on the logits given in input (best path).
- 5 `ctc_loss(...)`: Computes CTC (Connectionist Temporal Classification) loss.
- 6 `ctc_unique_labels(...)`: Get unique labels and indices for batched labels for tf.nn.ctc_loss.
--------------------
### Others
- 1 `top_k(...)`: Finds values and indices of the k largest entries for the last dimension.
- 2 `in_top_k(...)`: Says whether the targets are in the top K predictions.

# Customized callback in Keras



* 1: define a class that inherits from `tf.keras.callbacks.Callback`:

        class myCallback(tf.keras.callbacks.Callback):
-------
* 2: define the method that should run when the epoch ends, `on_epoch_end`:
    - Do NOT change the name of this method. You can change the name of the class, but Keras calls the method by this exact name. Other hooks exist too, for example `on_epoch_begin`.
    - don't forget to give it `self` as the first argument, then the epoch number, then the logs dictionary
    - the logs dict contains the loss and the metrics such as accuracy
    
            def on_epoch_end(self, epoch, logs=None):
---------
* 3: define the calculations you want inside that method:
     - it can be a simple print command, a condition, or other commands like saving weights, etc.
     
                logs = logs or {}
                # accuracy is a fraction between 0 and 1; the key is 'acc' or 'accuracy' depending on the metric name used in compile
                if logs.get('acc', 0) >= 0.99:
                    print('Reached 99% accuracy so cancelling training!')
     - to stop the training you can simply assign the value `True` to `self.model.stop_training`
     
                    self.model.stop_training = True
---------
* 4: create an instance of the class and pass it in the `callbacks` list of the fit method:
        `cb = myCallback()`
        `model.fit(x, y, epochs=1, callbacks=[cb])`
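
Assembled into one class, the four steps above might look like the sketch below. The class name `StopAtAccuracy` and the `target` parameter are hypothetical names chosen for this example; the callback is driven by hand with fake `logs` dictionaries so its behaviour can be seen without training a real model.

```python
import tensorflow as tf


class StopAtAccuracy(tf.keras.callbacks.Callback):
    """Stop training once the reported accuracy reaches `target`."""

    def __init__(self, target=0.99):
        super().__init__()
        self.target = target

    def on_epoch_end(self, epoch, logs=None):
        logs = logs or {}
        # The key depends on the metric name that was passed to compile().
        acc = logs.get('accuracy', logs.get('acc'))
        if acc is not None and acc >= self.target:
            print(f'Reached {self.target:.0%} accuracy, stopping training.')
            self.model.stop_training = True


# Drive the callback by hand with fake logs to show what it does.
model = tf.keras.Sequential([tf.keras.Input(shape=(1,)),
                             tf.keras.layers.Dense(1)])
cb = StopAtAccuracy(target=0.99)
cb.set_model(model)

cb.on_epoch_end(0, {'accuracy': 0.50})
stopped_early = bool(model.stop_training)

cb.on_epoch_end(1, {'accuracy': 0.995})
stopped_late = bool(model.stop_training)

print(stopped_early, stopped_late)
```

In real training you would not call `set_model` or `on_epoch_end` yourself; passing `callbacks=[cb]` to `model.fit` makes Keras do both.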

# Prediction on images

* 1- importing NumPy and the image module

        import numpy as np
        from tensorflow.keras.preprocessing import image
* 2- setting the path of the image

        img_path = 'path'
* 3- loading the image into a variable

        img = image.load_img(img_path, target_size=(height, width))  # replace height and width with your model's input size
  - don't forget the target size the model is expecting
* 4- processing the image variable to suit the model

        x = image.img_to_array(img)
        x = np.expand_dims(x, axis=0)  # add the batch dimension
        images = np.vstack([x])
    - then feed the variable `images` into the model's `predict` function.
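
The reason for `expand_dims` is that `predict` expects a batch of images, not a single one. A short NumPy-only sketch of the shapes involved, assuming a 150x150 RGB input (an assumed size for this example; use whatever your model expects) and a zero array standing in for the real output of `image.img_to_array(img)`:

```python
import numpy as np

# Stand-in for image.img_to_array(img): height x width x channels.
x = np.zeros((150, 150, 3), dtype='float32')

# Add the batch dimension in front: (1, 150, 150, 3).
batch = np.expand_dims(x, axis=0)

# np.vstack on a single batch leaves the shape unchanged ...
images = np.vstack([batch])

# ... and is mainly useful for stacking several images into one batch.
two_images = np.vstack([batch, batch])

print(x.shape, batch.shape, images.shape, two_images.shape)
```

If the training generator rescaled the pixels (for example `rescale=1.0/255`), apply the same scaling to `images` before calling `predict`.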

# Plot Loss and Accuracy

    import matplotlib.pyplot as plt
    # the keys match the metric name used in compile: 'acc' here, or 'accuracy' if you compiled with metrics=['accuracy']
    acc = history.history['acc']
    val_acc = history.history['val_acc']
    loss = history.history['loss']
    val_loss = history.history['val_loss']

    epochs = range(len(acc))

    plt.plot(epochs, acc, 'r', label='Training accuracy')
    plt.plot(epochs, val_acc, 'b', label='Validation accuracy')
    plt.title('Training and validation accuracy')
    plt.legend()

    plt.figure()

    plt.plot(epochs, loss, 'r', label='Training Loss')
    plt.plot(epochs, val_loss, 'b', label='Validation Loss')
    plt.title('Training and validation loss')
    plt.legend()

    plt.show()
        

---------------------------------------------------------------------
* Another Example:

            %matplotlib inline
            import matplotlib.image  as mpimg
            import matplotlib.pyplot as plt

-----------------------------------------------------------
* Retrieve the lists of results on the training and validation sets for each training epoch
-----------------------------------------------------------

            acc=history.history['acc']
            val_acc=history.history['val_acc']
            loss=history.history['loss']
            val_loss=history.history['val_loss']

            epochs=range(len(acc)) # Get number of epochs

------------------------------------------------
* Plot training and validation accuracy per epoch
------------------------------------------------

            plt.plot(epochs, acc, 'r', label="Training Accuracy")
            plt.plot(epochs, val_acc, 'b', label="Validation Accuracy")
            plt.title('Training and validation accuracy')
            plt.legend()
            plt.figure()

------------------------------------------------
* Plot training and validation loss per epoch
------------------------------------------------

            plt.plot(epochs, loss, 'r', label="Training Loss")
            plt.plot(epochs, val_loss, 'b', label="Validation Loss")
            plt.title('Training and validation loss')
            plt.legend()
---------------------------------------------------------------------
* Desired output. Charts with training and validation metrics. No crash :)
---------------------------------------------------------------------
* Another Plotting method to zoom into a desired area in the plot:

        import matplotlib.image  as mpimg
        import matplotlib.pyplot as plt

        #-----------------------------------------------------------
        # Retrieve the list of training-loss results
        # for each training epoch
        #-----------------------------------------------------------

        loss=history.history['loss']

        epochs=range(len(loss)) # Get number of epochs


        #------------------------------------------------
        # Plot the training loss per epoch
        #------------------------------------------------

        plt.plot(epochs, loss, 'r')
        plt.title('Training loss')
        plt.xlabel("Epochs")
        plt.ylabel("Loss")
        plt.legend(["Loss"])

        plt.figure()



        # zoom in on the epochs from 200 onward
        zoomed_loss = loss[200:]
        zoomed_epochs = range(200, len(loss))  # same length as zoomed_loss, whatever the number of epochs


        #------------------------------------------------
        # Plot the zoomed-in training loss per epoch
        #------------------------------------------------

        plt.plot(zoomed_epochs, zoomed_loss, 'r')
        plt.title('Training loss')
        plt.xlabel("Epochs")
        plt.ylabel("Loss")
        plt.legend(["Loss"])

        plt.show()
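
The zoom trick can also be wrapped in a small helper so the start epoch is a parameter. This is only a sketch: the helper name `plot_loss_from` is made up for this example, a synthetic loss curve stands in for `history.history['loss']`, and the `Agg` backend is selected just so the code runs without a display.

```python
import matplotlib
matplotlib.use('Agg')  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt


def plot_loss_from(loss, start=0):
    """Plot loss[start:] against its true epoch numbers; return figure and epochs."""
    epochs = range(start, len(loss))
    fig, ax = plt.subplots()
    ax.plot(epochs, loss[start:], 'r')
    ax.set_title('Training loss')
    ax.set_xlabel('Epochs')
    ax.set_ylabel('Loss')
    ax.legend(['Loss'])
    return fig, epochs


# Synthetic loss curve standing in for history.history['loss'].
loss = [1.0 / (epoch + 1) for epoch in range(500)]
fig, zoomed_epochs = plot_loss_from(loss, start=200)
print(zoomed_epochs[0], len(zoomed_epochs))
```

In a notebook you can drop the `matplotlib.use('Agg')` line and call `plt.show()` after the helper.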




# Cleaning Data using python

* 1- To filter the images by size (an image with a size of 0 bytes will not be copied to the destination):

        import os
        from shutil import copyfile

        def clear_data(data_path, destination_path):

            files = os.listdir(data_path)
            valid = []
            for i in files:
                path = os.path.join(data_path, i)

                if os.path.getsize(path) > 0:  # keep only the files that are not empty (size > 0 bytes)
                    valid.append(i)

            for t in valid:
                copyfile(os.path.join(data_path, t), os.path.join(destination_path, t))

        #data_path 
        CAT_SOURCE_DIR = r'M:\Courses\Coursera-2020\TensorFlow-in-Practice-Specialization\2 convolutional-neural-networks-tensorflow\Codes\CatsvsDogs\Cat' 
        DOG_SOURCE_DIR = r'M:\Courses\Coursera-2020\TensorFlow-in-Practice-Specialization\2 convolutional-neural-networks-tensorflow\Codes\CatsvsDogs\Dog' 

        #destination_path : MAKE SURE YOU HAVE CREATED THAT USING os.mkdir( path_you_want )
        cat_dest= r'M:\Courses\Coursera-2020\TensorFlow-in-Practice-Specialization\2 convolutional-neural-networks-tensorflow\Codes\CatsvsDogs\Catc'
        dog_dest= r'M:\Courses\Coursera-2020\TensorFlow-in-Practice-Specialization\2 convolutional-neural-networks-tensorflow\Codes\CatsvsDogs\Dogc'

        clear_data(CAT_SOURCE_DIR, cat_dest )
        clear_data(DOG_SOURCE_DIR, dog_dest )
---------------------
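`copyfile` fails if the destination folder does not exist, and `os.mkdir` itself fails when the parent folder is missing or the folder is already there. A small sketch of a more forgiving alternative using `os.makedirs(..., exist_ok=True)`; the helper name `ensure_dirs` is made up for this example, and the demo runs in a throwaway temp folder with placeholder sub-folder names:

```python
import os
import tempfile


def ensure_dirs(*paths):
    """Create every directory in `paths` (and its parents) if it is missing."""
    for path in paths:
        os.makedirs(path, exist_ok=True)


# Demo in a throwaway temp folder; the sub-folder names are placeholders.
root = tempfile.mkdtemp()
cat_dest = os.path.join(root, 'CatsvsDogs', 'Catc')
dog_dest = os.path.join(root, 'CatsvsDogs', 'Dogc')

ensure_dirs(cat_dest, dog_dest)
ensure_dirs(cat_dest)  # calling it again on an existing folder is harmless

print(os.path.isdir(cat_dest), os.path.isdir(dog_dest))
```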
* 2- To split the data in a directory into training and testing sets:

        import os
        import random
        from shutil import copyfile

        def split_data(SOURCE, TRAINING, TESTING, SPLIT_SIZE):

            files = os.listdir(SOURCE)
            valid = []
            for i in files:
                path = os.path.join(SOURCE, i)

                if os.path.getsize(path) > 0:  # skip empty files
                    valid.append(i)

            n_valid = len(valid)


            split = int(n_valid * SPLIT_SIZE)  # number of files that go to training

            shuffled = random.sample(valid, n_valid)  # shuffled copy of the valid file names

            train_set = shuffled[:split]
            test_set = shuffled[split:]

            for t in train_set:
                copyfile(os.path.join(SOURCE, t), os.path.join(TRAINING, t))

            for s in test_set:
                copyfile(os.path.join(SOURCE, s), os.path.join(TESTING, s))

        CAT_SOURCE_DIR = "/tmp/PetImages/Cat/"
        TRAINING_CATS_DIR = "/tmp/cats-v-dogs/training/cats/"
        TESTING_CATS_DIR = "/tmp/cats-v-dogs/testing/cats/"
        split_size = .9
        split_data(CAT_SOURCE_DIR, TRAINING_CATS_DIR, TESTING_CATS_DIR, split_size)
------------

* 3- To iterate through json text data:

        import json

        with open("/tmp/sarcasm.json", 'r') as f:
            datastore = json.load(f)

        # assuming the file holds a list of records, each one a dictionary with 3 keys ( 'headline', 'is_sarcastic' , 'article_link' )
        sentences = [] 
        labels = []
        urls = []
        for item in datastore:
            sentences.append(item['headline'])
            labels.append(item['is_sarcastic'])
            urls.append(item['article_link'])

------------
* to reverse a dictionary:

        # notice the ',' inside the tuples: we build a list of (value, key) tuples and turn it into a dict
        reverse = dict([(value, key) for (key, value) in my_dict.items()])
        # or
        variable = {u: i for i, u in enumerate(vocab)}  # notice the ':' between the key and value
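
A typical use of a reversed dictionary is decoding a sequence of indices back into words. A minimal sketch with a hand-made `word_index` (the words and indices here are invented for the example; in practice the dictionary would come from a fitted tokenizer):

```python
# Hand-made stand-in for tokenizer.word_index (word -> index).
word_index = {'<OOV>': 1, 'love': 2, 'dog': 3}

# Reverse it so that we can look words up by index (index -> word).
reverse_word_index = {index: word for word, index in word_index.items()}

# Decode a sequence of indices; unknown indices become '?'.
sequence = [2, 3, 1]
decoded = ' '.join(reverse_word_index.get(i, '?') for i in sequence)
print(decoded)
```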


# Directory/Path tricks


## Notice: if the model is not learning and the accuracy is not increasing, a common cause is the dataset: something may be wrong with the path or the folders in it, so check the '/' separators of the path carefully
---------------------------------
## Make sure your directory is correctly written
### it should be like:
`dir=r'M:\Courses\Coursera-2020\TensorFlow-in-Practice-Specialization\2 convolutional-neural-networks-tensorflow\Codes\CatsvsDogs'+'/'`
### the `r` prefix makes it a raw string, so the backslashes in the path are not treated as escape characters
--------------------------------

    import os

    # Define our example directories path -- each variable points at the folder itself, not its contents
    train_dir = '/tmp/training'
    validation_dir = '/tmp/validation'
    
    # Define the inner directories path -- each variable points at the folder itself, not its contents
    train_horses_dir = '/tmp/training/horses'
    train_humans_dir = '/tmp/training/humans'
    validation_horses_dir = '/tmp/validation/horses'
    validation_humans_dir = '/tmp/validation/humans'
    
    # os.listdir returns the names of the files inside a folder; the trailing '/' is optional
    train_horses_fnames = os.listdir(train_horses_dir + '/')
    train_humans_fnames = os.listdir(train_humans_dir + '/')
    validation_horses_fnames = os.listdir(validation_horses_dir + '/')
    validation_humans_fnames = os.listdir(validation_humans_dir + '/')
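
The path pitfalls above can be checked with a few lines of standard-library Python. This sketch uses a throwaway temp folder and an invented file name; it shows what the `r` prefix changes, how `os.path.join` adds the separator for you, and that `os.listdir` does not care about a trailing separator:

```python
import os
import tempfile

# A raw string keeps every backslash; a normal string turns '\t' and '\n'
# into a tab and a newline, which silently corrupts a Windows-style path.
raw = r'C:\temp\new'
not_raw = 'C:\temp\new'
print(len(raw), len(not_raw))

# os.path.join inserts the right separator for the current OS.
root = tempfile.mkdtemp()
horses_dir = os.path.join(root, 'training', 'horses')
os.makedirs(horses_dir)
open(os.path.join(horses_dir, 'horse01.png'), 'w').close()  # empty demo file

# os.listdir gives the same result with or without a trailing separator.
without_sep = os.listdir(horses_dir)
with_sep = os.listdir(horses_dir + os.sep)
print(without_sep, with_sep)
```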

# NLP related


### Steps:
#### Don't forget to create variables for the most important params of the tokenizer or embedding layer, like:
            words = 10000 # Maximum number of words to keep; the tokenizer keeps the most common 'n' words
            e_dim = 16 # Number of dimensions for the vector representing the word encoding
            max_length = 100
            trunc_type='post'
            padding_type='post'
            oov_tok = "<OOV>"
            training_size = 20000
* 1- get the text data from the source or input

* 2- create the tokenizer variable with the params needed 

* 3- fit the text to the tokenizer 

* 4- create the variable word_to_idx dictionary from the tokenizer

#### for the LABELS data and TESTING data you will repeat the next two points

* 5- generate the sequences from the tokenizer using the text/labels and assign them to the variable sequences

* 6- pad the sequences to the desired length and then convert them into np.array

* 7- build the model and pass the padded sequences as the input to the Embedding layer, which is the first layer in the model

* 8- if you use regular Dense layers after the Embedding layer, add a Flatten layer or `GlobalAveragePooling1D()` to flatten the embeddings before feeding them to the Dense layers

* 9- if you use LSTM layers you don't need to flatten the output of the Embedding layer. When stacking LSTMs, every LSTM except the last one must return sequences (`return_sequences=True`); the last LSTM must NOT return sequences

* 10- if you want to build a model that predicts a word, check the week 4 codes of TensorFlow in Practice

-----------
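Steps 1 to 8 can be sketched end to end as follows. Treat it as a minimal sketch under assumptions: the three sentences and their labels are invented toy data, the parameter values are shrunk so it runs in a moment, and the model uses the `GlobalAveragePooling1D` variant from step 8.

```python
import numpy as np
import tensorflow as tf

words = 100          # vocabulary size kept by the tokenizer
e_dim = 16           # embedding dimensions
max_length = 6       # every sequence is padded/truncated to this length
oov_tok = '<OOV>'

# Step 1: toy text data with made-up labels.
sentences = ['I love my dog', 'I love my cat', 'You love my dog so much']
labels = np.array([1, 1, 0])

# Steps 2-4: create the tokenizer, fit it, and grab the word -> index dict.
tokenizer = tf.keras.preprocessing.text.Tokenizer(num_words=words, oov_token=oov_tok)
tokenizer.fit_on_texts(sentences)
word_to_idx = tokenizer.word_index

# Steps 5-6: texts -> integer sequences -> padded NumPy array.
sequences = tokenizer.texts_to_sequences(sentences)
padded = np.array(tf.keras.preprocessing.sequence.pad_sequences(
    sequences, maxlen=max_length, padding='post', truncating='post'))

# Steps 7-8: Embedding first, then pooling to flatten, then Dense.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(max_length,)),
    tf.keras.layers.Embedding(words, e_dim),
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(1, activation='sigmoid'),
])
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(padded, labels, epochs=1, verbose=0)

print(padded.shape, word_to_idx[oov_tok])
```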
* Tokenizer

        tf.keras.preprocessing.text.Tokenizer( num_words=None, oov_token=None , lower=True, split=' ', char_level=False, document_count=0 )
        
    - split: str. Separator for word splitting.
    - char_level: if True, every character will be treated as a token.
    - oov_token: if given, it will be added to word_index and used to
            replace out-of-vocabulary words during `texts_to_sequences` calls

        * Updates internal vocabulary based on a list of sequences. Keep in mind that sequences are arrays of indices representing the words
            `fit_on_sequences( sequences )`
        * Updates internal vocabulary based on a list of texts.
            `fit_on_texts( texts )`
        * Converts a list of sequences into a Numpy matrix.
            `sequences_to_matrix( sequences, mode='binary' )`
        * Transforms each sequence into a list of text.
            `sequences_to_texts( sequences )`
        * Transforms each text in texts to a sequence of integers.
            `texts_to_sequences( texts )`
        * Convert a list of texts to a Numpy matrix.
            `texts_to_matrix( texts, mode='binary' )`
     - Returns a dictionary with the words as keys and their indices as values
            `tokenizer.word_index`
     
--------
* Padding:

            tf.keras.preprocessing.sequence.pad_sequences( sequences, maxlen=None, dtype='int32', padding='pre', truncating='pre', value=0.0 )

   - pads the sequences with zeros (`value`) so that they all have the same length `maxlen`; if `maxlen` is not specified, it pads them to the length of the longest individual sequence. Both `padding` and `truncating` default to `'pre'`; pass `'post'` to pad or cut at the end instead.
   
----------
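The difference between `'pre'` and `'post'` is easiest to see on a tiny input. A short sketch with two invented sequences, showing padding to `maxlen=4` and truncating to `maxlen=2` in both modes:

```python
import tensorflow as tf

pad_sequences = tf.keras.preprocessing.sequence.pad_sequences

sequences = [[1, 2, 3], [4, 5]]

# Padding: zeros go in front by default, or at the end with padding='post'.
pre = pad_sequences(sequences, maxlen=4)
post = pad_sequences(sequences, maxlen=4, padding='post')

# Truncating: values are dropped from the front by default,
# or from the end with truncating='post'.
cut_pre = pad_sequences(sequences, maxlen=2)
cut_post = pad_sequences(sequences, maxlen=2, truncating='post')

print(pre.tolist())
print(post.tolist())
print(cut_pre.tolist())
print(cut_post.tolist())
```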
* Embedding:

            tf.keras.layers.Embedding(  input_dim= -vocab_size- , output_dim= -embedding_dim-  ,input_length= -max_length- , 
                                        embeddings_initializer='uniform' , activity_regularizer=None,
                                        embeddings_constraint=None, mask_zero=False, embeddings_regularizer=None )
    - to visualize the embedding layer:
       
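            e = model.layers[0]
            weights = e.get_weights()[0]
            print(weights.shape) # shape: (vocab_size, embedding_dim)
            import io
            # map each index back to its word (assumes the fitted `tokenizer` from the steps above)
            reverse_word_index = {index: word for word, index in tokenizer.word_index.items()}
            out_v = io.open('vecs.tsv', 'w', encoding='utf-8')
            out_m = io.open('meta.tsv', 'w', encoding='utf-8')
            # vocab_size is the input_dim given to the Embedding layer; index 0 is reserved for padding
            for word_num in range(1, vocab_size):
              word = reverse_word_index[word_num]
              embeddings = weights[word_num]
              out_m.write(word + "\n")
              out_v.write('\t'.join([str(x) for x in embeddings]) + "\n")
            out_v.close()
            out_m.close()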
            #try:
            #  from google.colab import files
            #except ImportError:
            #  pass
            #else:
            #  files.download('vecs.tsv')
            #  files.download('meta.tsv')

     - this code will create the two files. To render the results, open the TensorFlow Embedding Projector at projector.tensorflow.org and load both files there
     
--------     
* LSTM:
            
            tf.keras.layers.LSTM(
                                    units, activation='tanh', recurrent_activation='sigmoid', use_bias=True,
                                    kernel_initializer='glorot_uniform', recurrent_initializer='orthogonal',
                                    bias_initializer='zeros', unit_forget_bias=True, kernel_regularizer=None,
                                    recurrent_regularizer=None, bias_regularizer=None, activity_regularizer=None,
                                    dropout=0.0, recurrent_dropout=0.0, implementation=2, return_sequences=False,
                                    return_state=False, go_backwards=False, stateful=False, time_major=False,
                                    unroll=False )
--------
* Bidirectional:

            tf.keras.layers.Bidirectional( layer, merge_mode='concat', weights=None, backward_layer=None )
            
--------
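When recurrent layers are stacked, every layer except the last needs `return_sequences=True` so the next one still receives a sequence. A minimal sketch with two stacked bidirectional LSTMs; the vocabulary size, sequence length and unit counts are arbitrary values chosen for the example:

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(100,)),                      # sequences of 100 word indices
    tf.keras.layers.Embedding(10000, 16),
    # First recurrent layer returns the full sequence for the next LSTM.
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(32, return_sequences=True)),
    # Last recurrent layer returns only its final output.
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(16)),
    tf.keras.layers.Dense(1, activation='sigmoid'),
])

# With merge_mode='concat' the forward and backward outputs are joined,
# so the feature size is twice the number of LSTM units.
first_shape = tuple(model.layers[1].output.shape)
second_shape = tuple(model.layers[2].output.shape)
print(first_shape, second_shape)
```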
* Conv1D:
            
            tf.keras.layers.Conv1D( filters, kernel_size, strides=1, padding='valid', activation=None )
            
-----------------

* MaxPooling1D:
            
            tf.keras.layers.MaxPooling1D(pool_size=4)
-------------
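With `padding='valid'`, both `Conv1D` and `MaxPooling1D` shorten the sequence, and the new length follows `floor((steps - window) / strides) + 1`. A plain-Python sketch of that arithmetic; the helper names are invented for this example, and the input length of 100 with `kernel_size=5` is an assumed configuration:

```python
def conv1d_valid_length(steps, kernel_size, strides=1):
    """Output length of Conv1D with padding='valid'."""
    return (steps - kernel_size) // strides + 1


def maxpool1d_valid_length(steps, pool_size, strides=None):
    """Output length of MaxPooling1D with padding='valid'.

    Keras uses strides = pool_size when strides is not given.
    """
    strides = pool_size if strides is None else strides
    return (steps - pool_size) // strides + 1


# A 100-step sequence through Conv1D(kernel_size=5), then MaxPooling1D(pool_size=4).
after_conv = conv1d_valid_length(100, kernel_size=5)
after_pool = maxpool1d_valid_length(after_conv, pool_size=4)
print(after_conv, after_pool)
```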
* GRU:
            
            tf.keras.layers.GRU(
                                   units, activation='tanh', recurrent_activation='sigmoid', use_bias=True,
                                   kernel_initializer='glorot_uniform', recurrent_initializer='orthogonal',
                                   bias_initializer='zeros', kernel_regularizer=None, recurrent_regularizer=None,
                                   bias_regularizer=None, activity_regularizer=None, kernel_constraint=None,
                                   recurrent_constraint=None, bias_constraint=None, dropout=0.0,
                                   recurrent_dropout=0.0, implementation=2, return_sequences=False,
                                   return_state=False, go_backwards=False, stateful=False, unroll=False,
                                   time_major=False, reset_after=True )
                                   
-----------

