<a href="https://colab.research.google.com/github/jay05Hawk/Keras_tutorial/blob/main/Keras_Notes.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

#$\color{red}{\text{Keras}}$ 
Keras is a deep learning API written in Python, running on top of the machine learning platform TensorFlow. It was developed with a focus on enabling fast experimentation. Being able to go from idea to result as fast as possible is key to doing good research.

Keras is:

$\color{blue}{\text{Simple--}}$  but not simplistic. Keras reduces developer cognitive load to free you to focus on the parts of the problem that really matter.

$\color{blue}{\text{Flexible --}}$ Keras adopts the principle of progressive disclosure of complexity: simple workflows should be quick and easy, while arbitrarily advanced workflows should be possible via a clear path that builds upon what you've already learned.

$\color{blue}{\text{Powerful --}}$ Keras provides industry-strength performance and scalability: it is used by organizations and companies including NASA, YouTube, and Waymo.

##$\color{blue}{\text{Optimizers}}$

- SGD
- RMSprop
- Adam
- AdamW
- Adadelta
- Adagrad
- Adamax
- Adafactor
- Nadam
- Ftrl


$\color{blue}{\text{Usage with compile() & fit()}}$ 

An optimizer is one of the two arguments required for compiling a Keras model:

In [1]:
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential()
model.add(layers.Dense(64, kernel_initializer='uniform', input_shape=(10,)))
model.add(layers.Activation('softmax'))

opt = keras.optimizers.Adam(learning_rate=0.01)
model.compile(loss='categorical_crossentropy', optimizer=opt)

In [2]:
# pass optimizer by name: default parameters will be used
model.compile(loss='categorical_crossentropy', optimizer='adam')

#Usage in a custom training loop
When writing a custom training loop, you would retrieve gradients via a $\color{blue}{\text{tf.GradientTape}}$ instance, then call $\color{blue}{\text{optimizer.apply_gradients()}}$ to update your weights:

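In [4]:
import tensorflow as tf

# Instantiate an optimizer and a loss function.
optimizer = tf.keras.optimizers.Adam()
loss_fn = tf.keras.losses.CategoricalCrossentropy()

# Placeholder data (random, for illustration only): 256 samples with
# 10 features and one-hot labels over the 64 outputs of the model above.
x_train = tf.random.uniform((256, 10))
y_train = tf.one_hot(tf.random.uniform((256,), maxval=64, dtype=tf.int32), depth=64)
dataset = tf.data.Dataset.from_tensor_slices((x_train, y_train)).batch(32)

# Iterate over the batches of a dataset.
for x, y in dataset:
    # Open a GradientTape.
    with tf.GradientTape() as tape:
        # Forward pass. The model above ends in softmax, so these are probabilities.
        probs = model(x, training=True)
        # Loss value for this batch.
        loss_value = loss_fn(y, probs)

    # Get gradients of loss wrt the weights.
    gradients = tape.gradient(loss_value, model.trainable_weights)

    # Update the weights of the model.
    optimizer.apply_gradients(zip(gradients, model.trainable_weights))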
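
The same four steps (forward pass, loss, gradients, update) can be seen without any framework. The following is a plain-Python sketch that fits a one-parameter model y = w * x with a hand-derived gradient. The helper `apply_gradients` and the `params` dictionary are hypothetical stand-ins that only mimic the (gradient, variable) pairing; they are not the Keras API.

```python
# Plain-Python sketch of a training loop: forward pass, loss, gradient, update.
# apply_gradients and params are illustrative names, not part of Keras.

def apply_gradients(grads_and_vars, params, learning_rate=0.05):
    """Vanilla SGD step: subtract learning_rate * gradient from each named parameter."""
    for grad, name in grads_and_vars:
        params[name] -= learning_rate * grad

# Toy dataset generated from y = 2 * x, split into two batches of (xs, ys).
dataset = [([1.0, 2.0], [2.0, 4.0]), ([3.0, 4.0], [6.0, 8.0])]
params = {"w": 0.0}

for epoch in range(50):
    for xs, ys in dataset:
        # Forward pass.
        preds = [params["w"] * x for x in xs]
        # Mean squared error for this batch.
        loss_value = sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(xs)
        # Gradient of the loss with respect to w, derived by hand.
        grad_w = sum(2 * (p - y) * x for p, y, x in zip(preds, ys, xs)) / len(xs)
        # Update the weight.
        apply_gradients([(grad_w, "w")], params)

print(params["w"], loss_value)
```

In the Keras version, `tf.GradientTape` replaces the hand-derived gradient and the optimizer replaces the fixed-step update.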

#Learning rate decay / scheduling
You can use a learning rate schedule to modulate how the learning rate of your optimizer changes over time:

In [None]:
lr_schedule = keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=1e-2,
    decay_steps=10000,
    decay_rate=0.9)
optimizer = keras.optimizers.SGD(learning_rate=lr_schedule)
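
To get a feel for what this schedule produces, the sketch below recomputes the decayed rate in plain Python, assuming the non-staircase form `initial_learning_rate * decay_rate ** (step / decay_steps)`. The function name `exponential_decay` is illustrative only.

```python
def exponential_decay(step, initial_learning_rate=1e-2, decay_steps=10000, decay_rate=0.9):
    """Learning rate after `step` optimizer steps, without staircase rounding."""
    return initial_learning_rate * decay_rate ** (step / decay_steps)

# The rate shrinks by a factor of 0.9 every 10000 steps.
for step in (0, 5000, 10000, 20000):
    print(step, exponential_decay(step))
```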

###apply_gradients method
Arguments

- grads_and_vars: List of (gradient, variable) pairs.

- name: string, defaults to None. The name of the namescope to use when creating variables. If None, self.name will be used.

- skip_gradients_aggregation: If true, gradients aggregation will not be performed inside optimizer. Usually this arg is set to True when you write custom code aggregating gradients outside the optimizer.

- \*\*kwargs: keyword arguments only used for backward compatibility.

Returns

A tf.Variable, representing the current iteration.

In [None]:
Optimizer.apply_gradients(
    grads_and_vars, name=None, skip_gradients_aggregation=False, **kwargs
)

In [None]:
Optimizer.variables()
# Returns variables of this optimizer.

#Losses
The purpose of loss functions is to compute the quantity that a model should seek to minimize during training.

Available losses
Note that all losses are available both via a class handle and via a function handle. The class handles enable you to pass configuration arguments to the constructor (e.g. loss_fn = CategoricalCrossentropy(from_logits=True)), and they perform reduction by default when used in a standalone way (see details below).
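
The difference between the two kinds of handle can be illustrated with a plain-Python sketch. The names below deliberately mirror the Keras ones, but they are local re-implementations written for illustration, and the `reduction` option values are assumptions rather than the library's own.

```python
def mean_squared_error(y_true, y_pred):
    """Function-style handle: one loss value per sample, no reduction."""
    return [
        sum((t - p) ** 2 for t, p in zip(row_t, row_p)) / len(row_t)
        for row_t, row_p in zip(y_true, y_pred)
    ]

class MeanSquaredError:
    """Class-style handle: configured in the constructor, reduces to a scalar when called."""

    def __init__(self, reduction="mean"):
        self.reduction = reduction  # illustrative options: "mean" or "sum"

    def __call__(self, y_true, y_pred):
        per_sample = mean_squared_error(y_true, y_pred)
        total = sum(per_sample)
        return total / len(per_sample) if self.reduction == "mean" else total

y_true = [[0.0, 1.0], [1.0, 1.0]]
y_pred = [[0.0, 0.5], [0.0, 1.0]]

per_sample = mean_squared_error(y_true, y_pred)   # one value per row
scalar = MeanSquaredError()(y_true, y_pred)       # single averaged value
print(per_sample, scalar)
```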

$\color{blue}{\text{Probabilistic losses}}$ for classification problems
- BinaryCrossentropy class
- CategoricalCrossentropy class
- SparseCategoricalCrossentropy class
- Poisson class
- binary_crossentropy function
- categorical_crossentropy function
- sparse_categorical_crossentropy function
- poisson function
- KLDivergence class
- kl_divergence function
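
As a rough illustration of what a probabilistic loss measures, here is a plain-Python sketch of binary cross-entropy on predicted probabilities. The clipping constant `epsilon` is an assumption chosen for numerical safety, and this is not the Keras implementation.

```python
import math

def binary_crossentropy(y_true, y_pred, epsilon=1e-7):
    """Mean of -[y*log(p) + (1-y)*log(1-p)] over the given labels."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, epsilon), 1.0 - epsilon)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

# Confident, correct predictions give a small loss.
loss = binary_crossentropy([1, 0], [0.9, 0.2])
print(loss)
```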

$\color{blue}{\text{Regression losses}}$ for regression problems
- MeanSquaredError class
- MeanAbsoluteError class
- MeanAbsolutePercentageError class
- MeanSquaredLogarithmicError class
- CosineSimilarity class
- mean_squared_error function
- mean_absolute_error function
- mean_absolute_percentage_error function
- mean_squared_logarithmic_error function
- cosine_similarity function
- Huber class
- huber function
- LogCosh class
- log_cosh function
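
Among these, the Huber loss is the least self-explanatory: it is quadratic for small errors and linear for large ones. A minimal plain-Python sketch, assuming the usual definition with threshold `delta` (not the Keras source):

```python
def huber(y_true, y_pred, delta=1.0):
    """Mean Huber loss: 0.5*e^2 when |e| <= delta, else delta*(|e| - 0.5*delta)."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        error = abs(t - p)
        if error <= delta:
            total += 0.5 * error ** 2                # quadratic region
        else:
            total += delta * (error - 0.5 * delta)   # linear region
    return total / len(y_true)

# One small error (0.5) and one large error (3.0).
loss = huber([0.0, 0.0], [0.5, 3.0])
print(loss)
```

The large error contributes 2.5 instead of the 4.5 it would under half squared error, which is why Huber is less sensitive to outliers.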

$\color{blue}{\text{Hinge losses for "maximum-margin" classification}}$
- Hinge class
- SquaredHinge class
- CategoricalHinge class
- hinge function
- squared_hinge function
- categorical_hinge function
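
A plain-Python sketch of the hinge and squared hinge formulas, assuming labels encoded as -1 or +1 (illustrative re-implementations, not the Keras functions):

```python
def hinge(y_true, y_pred):
    """Mean of max(0, 1 - y*p); zero once a prediction clears the margin."""
    return sum(max(0.0, 1.0 - t * p) for t, p in zip(y_true, y_pred)) / len(y_true)

def squared_hinge(y_true, y_pred):
    """Mean of max(0, 1 - y*p)^2; penalizes margin violations more strongly."""
    return sum(max(0.0, 1.0 - t * p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

labels = [1, -1, 1]
scores = [0.8, -2.0, -0.5]   # second sample is beyond the margin, third is misclassified
print(hinge(labels, scores), squared_hinge(labels, scores))
```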


#$\color{blue}{\text{Data loading  }}$
Keras data loading utilities, located in tf.keras.utils, help you go from raw data on disk to a tf.data.Dataset object that can be used to efficiently train a model.

These loading utilities can be combined with preprocessing layers to further transform your input dataset before training.


[Read it for more info](https://keras.io/api/data_loading/)



#$\color{blue}{\text{KerasCV API  }}$
KerasCV is a toolbox of modular building blocks (layers, metrics, losses, data augmentation) that computer vision engineers can leverage to quickly assemble production-grade, state-of-the-art training and inference pipelines for common use cases such as image classification, object detection, image segmentation, image data augmentation, etc.

KerasCV Layers can be used independently, or with the keras.Model class. Layers implement specific self-contained logic such as image data augmentation, regularization during training, and more.

KerasCV Metrics are used for model evaluation, both at training time and after training. They can be used independently, or as part of the standard Model.fit(), Model.evaluate(), Model.predict() flow.


[Read the KerasCV API documentation for more info](https://keras.io/api/keras_cv/)

###Layers
- Augmentation layers
- Preprocessing layers
- Regularization layers

###Models
- StableDiffusion image-generation model
- The RetinaNet model
- The FasterRCNN model
- EfficientNetV2 models
- DenseNet models

###Bounding box formats and utilities
- Bounding box formats
- Bounding box utilities
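
The reason bounding box formats matter is that the same four numbers mean different things depending on the convention. The sketch below converts between two common conventions in plain Python. The helper names are hypothetical, and the labels "xyxy" (corners) and "xywh" (corner plus size) are used here in their common sense; check the KerasCV documentation for the exact formats it supports.

```python
def xyxy_to_xywh(box):
    """(left, top, right, bottom) -> (left, top, width, height)."""
    x1, y1, x2, y2 = box
    return (x1, y1, x2 - x1, y2 - y1)

def xywh_to_xyxy(box):
    """(left, top, width, height) -> (left, top, right, bottom)."""
    x, y, w, h = box
    return (x, y, x + w, y + h)

def to_relative(box_xyxy, image_width, image_height):
    """Scale absolute pixel corners to the [0, 1] range of the image."""
    x1, y1, x2, y2 = box_xyxy
    return (x1 / image_width, y1 / image_height, x2 / image_width, y2 / image_height)

box = (10, 20, 50, 80)
print(xyxy_to_xywh(box))
print(to_relative(box, image_width=100, image_height=200))
```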

#$\color{blue}{\text{KerasNLP  }}$
KerasNLP is a toolbox of modular building blocks ranging from pretrained state-of-the-art models to low-level Transformer encoder layers. For an introduction to the library, see the KerasNLP home page. For a high-level introduction to the API, see the KerasNLP getting started guide.


[Read the KerasNLP API documentation for more info](https://keras.io/api/keras_nlp/)

##Models
- Bert
- DistilBert
- Roberta
- XLMRoberta

##Tokenizers
- Tokenizer base class
- WordPieceTokenizer
- SentencePieceTokenizer
- BytePairTokenizer
- ByteTokenizer
- UnicodeCodepointTokenizer
- compute_word_piece_vocabulary function
- compute_sentence_piece_proto function
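
To make the idea of subword tokenization concrete, here is a plain-Python sketch of greedy longest-match-first WordPiece splitting for a single word. The vocabulary, the `##` continuation prefix, and the `[UNK]` token are assumptions for illustration; this is not the KerasNLP WordPieceTokenizer.

```python
def wordpiece_tokenize(word, vocab, unk_token="[UNK]", prefix="##"):
    """Split one word into the longest matching vocabulary pieces, left to right."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        match = None
        # Shrink the candidate substring until it is found in the vocabulary.
        while end > start:
            candidate = word[start:end]
            if start > 0:
                candidate = prefix + candidate  # continuation pieces carry the prefix
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        if match is None:
            return [unk_token]  # no piece fits, so the whole word is unknown
        pieces.append(match)
        start = end
    return pieces

# Hypothetical toy vocabulary.
vocab = {"un", "##aff", "##able", "play", "##ing"}
print(wordpiece_tokenize("unaffable", vocab))
print(wordpiece_tokenize("playing", vocab))
```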

##Preprocessing Layers
- StartEndPacker layer
- MultiSegmentPacker layer
- RandomSwap layer
- RandomDeletion layer
- MaskedLMMaskGenerator layer
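
As an illustration of what a packing layer does, the following plain-Python sketch adds start and end markers to a token sequence and pads or truncates it to a fixed length. The function name and argument names are hypothetical and the behavior is a simplified assumption; it is not the StartEndPacker layer itself.

```python
def start_end_pack(tokens, sequence_length, start_value, end_value, pad_value=0):
    """Wrap tokens in start/end markers, then truncate or pad to sequence_length."""
    # Reserve two slots for the start and end markers, truncating if needed.
    body = tokens[: sequence_length - 2]
    packed = [start_value] + body + [end_value]
    # Fill the remaining positions with the padding value.
    return packed + [pad_value] * (sequence_length - len(packed))

print(start_end_pack([5, 6, 7], sequence_length=7, start_value=1, end_value=2))
print(start_end_pack([5, 6, 7, 8, 9], sequence_length=5, start_value=1, end_value=2))
```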


##Modeling Layers
- TransformerEncoder layer
- TransformerDecoder layer
- FNetEncoder layer
- PositionEmbedding layer
- SinePositionEncoding layer
- TokenAndPositionEmbedding layer
- MaskedLMHead layer
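
For intuition about sinusoidal position encodings, here is a plain-Python sketch following the formulation from the original Transformer paper, with sine on even dimensions and cosine on odd ones. The function name and the `max_wavelength` default are assumptions, and the KerasNLP SinePositionEncoding layer may differ in detail.

```python
import math

def sine_position_encoding(sequence_length, hidden_dim, max_wavelength=10000):
    """Return a sequence_length x hidden_dim table of sinusoidal position features."""
    encoding = []
    for pos in range(sequence_length):
        row = []
        for i in range(hidden_dim):
            # Each pair of dimensions (2k, 2k+1) shares one frequency.
            angle = pos / max_wavelength ** (2 * (i // 2) / hidden_dim)
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        encoding.append(row)
    return encoding

table = sine_position_encoding(sequence_length=4, hidden_dim=6)
for row in table:
    print([round(v, 3) for v in row])
```

Every value lies in [-1, 1], and position 0 is always the alternating pattern of sin(0) and cos(0).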

##Metrics
- Perplexity metric
- RougeL metric
- RougeN metric
- Bleu metric
- EditDistance metric
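
Of these metrics, edit distance is easy to sketch by hand. The following plain-Python function computes the Levenshtein distance between two token sequences using the standard dynamic-programming recurrence. It is an unnormalized illustration, not the KerasNLP EditDistance metric.

```python
def edit_distance(reference, hypothesis):
    """Minimum number of insertions, deletions and substitutions between two sequences."""
    # previous[j] holds the distance between the processed reference prefix
    # and the first j tokens of the hypothesis.
    previous = list(range(len(hypothesis) + 1))
    for i, ref_tok in enumerate(reference, start=1):
        current = [i]
        for j, hyp_tok in enumerate(hypothesis, start=1):
            cost = 0 if ref_tok == hyp_tok else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # match or substitution
        previous = current
    return previous[-1]

print(edit_distance("the cat sat".split(), "the cat sat down".split()))
print(edit_distance("kitten", "sitting"))
```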

##Utils
- greedy_search function
- top_k_search function
- top_p_search function
- random_search function
- beam_search function
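
The simplest of these decoding strategies, greedy search, picks the most probable next token at every step. Below is a plain-Python sketch under stated assumptions: `toy_model` and its probability table are hypothetical stand-ins for a real language model, and the function signature does not reproduce the KerasNLP one.

```python
def greedy_search(next_token_probs, prompt, max_length, end_token):
    """Repeatedly append the highest-probability token until end_token or max_length."""
    tokens = list(prompt)
    while len(tokens) < max_length:
        probs = next_token_probs(tokens)
        # Index of the largest probability is the chosen token id.
        next_token = max(range(len(probs)), key=probs.__getitem__)
        tokens.append(next_token)
        if next_token == end_token:
            break
    return tokens

# Hypothetical toy "model": next-token probabilities depend only on the last token.
TABLE = {
    0: [0.1, 0.7, 0.1, 0.1],
    1: [0.1, 0.1, 0.6, 0.2],
    2: [0.05, 0.05, 0.1, 0.8],
    3: [0.25, 0.25, 0.25, 0.25],
}

def toy_model(tokens):
    return TABLE[tokens[-1]]

print(greedy_search(toy_model, prompt=[0], max_length=10, end_token=3))
```

Top-k, top-p and random search differ only in how the next token is chosen from `probs`; beam search additionally keeps several candidate sequences at once.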

[Examples NLP](https://keras.io/examples/nlp/)