<a href="https://colab.research.google.com/github/rameshver43/Practice/blob/master/neural_networks.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**Neural Network**

Different ways to create neural networks:

1.   **Building from scratch**: This approach involves manually implementing the forward and backward propagation algorithms, typically with a numerical library like NumPy or with the low-level tensor operations of a framework like TensorFlow or PyTorch. It provides the highest level of control but requires more effort and expertise.
2.   **Using deep learning frameworks**: Frameworks like TensorFlow, PyTorch, Keras, and Caffe provide high-level APIs that abstract away many implementation details, making it easier to build and train neural networks. These frameworks offer a variety of pre-defined layers, activation functions, and optimization algorithms that you can use to create your model.
3.   **Transfer learning**: Transfer learning involves leveraging pre-trained neural network models, such as those trained on large-scale image datasets like ImageNet. By utilizing the weights and architectures of these models, you can either fine-tune them on your specific task or use them as feature extractors by freezing the pre-trained layers and adding your own custom layers on top.
4.   **AutoML tools**: Automated Machine Learning (AutoML) platforms and tools, such as Google AutoML, H2O.ai, and AutoKeras, provide a higher level of abstraction, automating much of the process of designing and training neural networks. These tools often use techniques like neural architecture search (NAS) to search for a well-performing network architecture for a given task.
5.   **Using pre-built models**: Many deep learning frameworks offer pre-built models for specific tasks, such as image classification (e.g., VGG, ResNet) or natural language processing (e.g., BERT, GPT). These models are typically trained on large-scale datasets and can be readily used or fine-tuned for similar tasks without requiring extensive network design.
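
To make the first approach concrete, here is a minimal sketch of a network built from scratch with NumPy: one tanh hidden layer and a sigmoid output, trained on XOR with hand-written forward and backward passes. The function name train_xor, the layer size, the learning rate, and the seed are illustrative assumptions, not part of any library.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_xor(steps=3000, learning_rate=0.3, hidden_units=8, seed=0):
    """Train a tiny 2 -> hidden_units -> 1 network on XOR with manual backprop."""
    rng = np.random.default_rng(seed)
    X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
    y = np.array([[0.0], [1.0], [1.0], [0.0]])

    # Parameters: random weights, zero biases
    W1 = rng.normal(0.0, 1.0, size=(2, hidden_units))
    b1 = np.zeros((1, hidden_units))
    W2 = rng.normal(0.0, 1.0, size=(hidden_units, 1))
    b2 = np.zeros((1, 1))

    history = []
    for _ in range(steps):
        # Forward pass
        h = np.tanh(X @ W1 + b1)
        p = sigmoid(h @ W2 + b2)
        p_safe = np.clip(p, 1e-7, 1 - 1e-7)  # avoid log(0)
        loss = -np.mean(y * np.log(p_safe) + (1 - y) * np.log(1 - p_safe))
        history.append(float(loss))

        # Backward pass (gradients of the mean binary cross-entropy)
        d_logits = (p - y) / len(X)
        dW2 = h.T @ d_logits
        db2 = d_logits.sum(axis=0, keepdims=True)
        d_h = d_logits @ W2.T
        d_pre = d_h * (1.0 - h ** 2)  # derivative of tanh
        dW1 = X.T @ d_pre
        db1 = d_pre.sum(axis=0, keepdims=True)

        # Gradient descent update
        W1 -= learning_rate * dW1
        b1 -= learning_rate * db1
        W2 -= learning_rate * dW2
        b2 -= learning_rate * db2
    return history

loss_history = train_xor()
print(f"first loss: {loss_history[0]:.4f}, last loss: {loss_history[-1]:.4f}")
```

Everything a framework normally hides (parameter initialization, gradients, the update rule) is written out by hand here, which is exactly the trade-off this approach makes.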





Different examples of creating models using TensorFlow:


*   **Sequential Model**: The Sequential model is a linear stack of layers. It's the simplest way to create a model in TensorFlow, suitable for most common use cases.



In [None]:
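import tensorflow as tf

input_dim = 20  # number of input features (example value)

model = tf.keras.models.Sequential([
    tf.keras.Input(shape=(input_dim,)),
    tf.keras.layers.Dense(64, activation='relu'),    # hidden layer
    tf.keras.layers.Dense(10, activation='softmax')  # 10-class output
])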

**Functional API**: The Functional API allows for more complex model architectures, including models with shared layers or multiple inputs/outputs.

In [None]:
import tensorflow as tf

input_dim = 20  # number of input features (example value)

# Each layer is called on the output of the previous one
input_1 = tf.keras.Input(shape=(input_dim,))
x = tf.keras.layers.Dense(64, activation='relu')(input_1)
x = tf.keras.layers.Dense(32, activation='relu')(x)
output = tf.keras.layers.Dense(10, activation='softmax')(x)

model = tf.keras.Model(inputs=input_1, outputs=output)


**Subclassing Model**: Subclassing the tf.keras.Model class provides the highest level of flexibility, allowing you to define custom forward pass computations and create complex architectures.

In [None]:
import tensorflow as tf

class MyModel(tf.keras.Model):
    def __init__(self):
        super().__init__()
        # Layers are created once here and reused on every call
        self.dense1 = tf.keras.layers.Dense(64, activation='relu')
        self.dense2 = tf.keras.layers.Dense(10, activation='softmax')

    def call(self, inputs):
        # Forward pass: hidden layer, then 10-way softmax output
        x = self.dense1(inputs)
        x = self.dense2(x)
        return x

model = MyModel()


**Pre-trained Models**: TensorFlow provides pre-trained models that you can load and use directly for tasks like image classification, object detection, and natural language processing.

In [None]:
import tensorflow as tf

num_classes = 5  # number of target classes (example value)

# Load ResNet50 with ImageNet weights, without its classification head
base_model = tf.keras.applications.ResNet50(weights='imagenet', include_top=False)
# Add custom layers on top of the pre-trained model
x = base_model.output
x = tf.keras.layers.GlobalAveragePooling2D()(x)
output = tf.keras.layers.Dense(num_classes, activation='softmax')(x)

model = tf.keras.Model(inputs=base_model.input, outputs=output)


The main difference between creating a model using the Sequential model and using the Functional API in TensorFlow lies in the flexibility and complexity of the model architectures that can be constructed.


**Sequential Model**: The Sequential model is a linear stack of layers, where each layer has a single input tensor and produces a single output tensor. It is suitable for building simple models with a straightforward flow of data, where each layer feeds into the next sequentially.
With the Sequential model, you define the architecture either by passing a list of layers to the constructor, as in the example above, or by adding layers one at a time with the .add() method. It is the simplest way to create models in TensorFlow and works well for plain feedforward networks, including simple image classifiers.
The Sequential model is easy to use and understand, but it may not be suitable for more complex architectures with multiple inputs/outputs or shared layers.

**Functional API**: The Functional API in TensorFlow provides a more flexible and powerful way to create models, allowing for complex architectures with multiple inputs, multiple outputs, and shared layers.
With the Functional API, you define the model architecture by explicitly creating input tensors and connecting layers through function calls. Each layer is treated as a function that takes tensors as input and returns tensors as output. This lets you build models with branching, merging, or skip connections.
The Functional API therefore supports more intricate designs, such as multi-task models or models with residual (skip) connections, and gives you greater control over how data flows through the model.
It also lets you share layers, so that the same layer instance, with the same weights, is used in several places in the model. This is particularly useful for siamese networks or for complex models built from subnetworks.
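
As a small illustration of multiple inputs and a shared layer, the sketch below builds a siamese-style model in which one Dense layer encodes both inputs. The layer sizes and the names input_a, input_b, shared_encoder, and siamese are assumptions chosen for this example.

```python
import tensorflow as tf

# Two inputs with the same feature size
input_a = tf.keras.Input(shape=(16,), name="input_a")
input_b = tf.keras.Input(shape=(16,), name="input_b")

# One layer instance applied to both inputs, so its weights are shared
shared_encoder = tf.keras.layers.Dense(8, activation="relu", name="shared_encoder")
encoded_a = shared_encoder(input_a)
encoded_b = shared_encoder(input_b)

# Merge the two branches and predict a single similarity score
merged = tf.keras.layers.Concatenate()([encoded_a, encoded_b])
similarity = tf.keras.layers.Dense(1, activation="sigmoid", name="similarity")(merged)

siamese = tf.keras.Model(inputs=[input_a, input_b], outputs=similarity)
siamese.summary()
```

A Sequential model cannot express this, because the data no longer flows through a single chain of layers.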

The compile() method in TensorFlow is used to configure the model for training. It takes several parameters to define the optimization process, loss function, and evaluation metrics. Here is a detailed explanation of the parameters:



In [None]:
model.compile(
    optimizer='adam',
    loss=None,
    metrics=None,
    loss_weights=None,
    weighted_metrics=None,
    run_eagerly=None
)


**optimizer**: The optimizer parameter specifies the optimization algorithm used to update the model's weights during training. Some commonly used optimizers are 'adam', 'sgd', 'rmsprop', and 'adagrad'. A string name selects that optimizer with its default settings; to set parameters such as the learning rate, pass an optimizer instance from the tf.keras.optimizers module instead.

**loss**: The loss parameter defines the loss function to be minimized during training. It quantifies how well the model predicts the output compared to the true target values. Common loss functions for different problem types include 'mean_squared_error', 'binary_crossentropy', 'categorical_crossentropy', and 'sparse_categorical_crossentropy'. You can also define custom loss functions using tf.keras.losses.

**metrics**: The metrics parameter specifies the evaluation metrics to be computed during training and evaluation. Metrics are used to assess the model's performance but, unlike the loss, are not used to update the weights. Pass them as a list, for example metrics=['accuracy']. Metrics such as precision and recall are most portably passed as objects, for example metrics=['accuracy', tf.keras.metrics.Precision()], because which string names are accepted depends on the Keras version.

**loss_weights**: The loss_weights parameter allows you to assign different weights to different loss functions if your model has multiple outputs or multi-task learning. It is a list or dictionary specifying the weight for each loss. This parameter is useful when you want to emphasize or de-emphasize certain losses in the overall training objective.

**weighted_metrics**: The weighted_metrics parameter takes a list of metrics in the same way as metrics. The difference is that these metrics are weighted by the sample_weight or class_weight passed during training and evaluation, which is useful when the training dataset is imbalanced.

**run_eagerly**: The run_eagerly parameter is a Boolean that indicates whether to execute the model eagerly or in graph mode. By default, Keras compiles the training step into a TensorFlow graph for performance reasons. Setting run_eagerly=True runs the model step by step in eager mode, which is slower but makes debugging easier because you can inspect intermediate values with ordinary Python tools.
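
The following sketch shows how optimizer, loss, and loss_weights fit together for a model with two outputs. The architecture and the names class_output, value_output, and multi_output_model are assumptions made for this example. With the weights below, the training objective is 1.0 times the cross-entropy of the first output plus 0.5 times the mean squared error of the second.

```python
import tensorflow as tf

# A small model with a shared hidden layer and two output heads
inputs = tf.keras.Input(shape=(8,))
shared = tf.keras.layers.Dense(16, activation="relu")(inputs)
class_output = tf.keras.layers.Dense(3, activation="softmax", name="class_output")(shared)
value_output = tf.keras.layers.Dense(1, name="value_output")(shared)
multi_output_model = tf.keras.Model(inputs=inputs, outputs=[class_output, value_output])

multi_output_model.compile(
    # An optimizer instance lets you set the learning rate explicitly
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
    # One loss per output, in the same order as the outputs
    loss=["sparse_categorical_crossentropy", "mse"],
    # Relative weight of each loss in the total objective
    loss_weights=[1.0, 0.5],
)
```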



**optimizer:**
The optimizer parameter in compile() specifies the optimization algorithm used to update the model's weights during training. TensorFlow provides various optimizers, including:

*   '**adam**': Adam optimizer is a popular choice that adapts the learning rate for each parameter based on the estimates of first and second moments of the gradients.
*   '**sgd**': Stochastic Gradient Descent (SGD) optimizer updates the weights based on the gradient of the loss function with respect to the weights.
*   '**rmsprop**': RMSprop optimizer uses a moving average of squared gradients to normalize the updates.
*   '**adagrad**': Adagrad optimizer adapts the learning rate based on the historical gradient information for each parameter.

You can also use other optimizers provided by TensorFlow, or configure an optimizer instance from the tf.keras.optimizers module. For example:

In [None]:
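from tensorflow.keras.optimizers import SGD

# SGD with an explicit learning rate and momentum
optimizer = SGD(learning_rate=0.01, momentum=0.9)
# The loss here is only an example; choose one that matches your task
model.compile(optimizer=optimizer, loss='sparse_categorical_crossentropy')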
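
To see what the momentum argument does, the plain-Python sketch below applies the momentum update rule (velocity = momentum * velocity - learning_rate * gradient, then w = w + velocity) to a one-dimensional quadratic. The function name sgd_momentum_minimize and the toy objective f(w) = (w - 3)^2 are assumptions for illustration; this follows the rule described in the Keras SGD documentation but is not TensorFlow's implementation.

```python
def sgd_momentum_minimize(grad_fn, w0, learning_rate=0.01, momentum=0.9, steps=500):
    """Minimize a 1-D function with SGD plus momentum, given its gradient."""
    w = w0
    velocity = 0.0
    for _ in range(steps):
        gradient = grad_fn(w)
        # Velocity accumulates past gradients, decayed by the momentum factor
        velocity = momentum * velocity - learning_rate * gradient
        w = w + velocity
    return w

def quadratic_gradient(w):
    # Gradient of the toy objective f(w) = (w - 3)^2
    return 2.0 * (w - 3.0)

w_final = sgd_momentum_minimize(quadratic_gradient, w0=0.0)
print(w_final)
```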


**loss**:
The loss parameter in compile() defines the loss function to be minimized during training. The loss function quantifies the difference between the predicted output and the true target values. Different types of problems (e.g., regression, binary classification, multi-class classification) require different loss functions. Some commonly used loss functions include:

*   '**mean_squared_error**': Mean Squared Error (MSE) is often used for regression problems.
*   '**binary_crossentropy**': Binary Crossentropy is commonly used for binary classification problems.
*   '**categorical_crossentropy**': Categorical Crossentropy is used for multi-class classification with one-hot encoded target labels.
*   '**sparse_categorical_crossentropy**': Sparse Categorical Crossentropy is used for multi-class classification with integer target labels.

Losses can also be passed as objects from tf.keras.losses, and you can define custom ones by subclassing the tf.keras.losses.Loss class or by writing a function of (y_true, y_pred). For example, passing a built-in loss object:

In [None]:
from tensorflow.keras import losses

loss = losses.MeanSquaredError()
# The optimizer here is only an example
model.compile(optimizer='adam', loss=loss)
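
To show what these loss names compute, here is a plain-Python sketch of the four losses listed above, averaged over the samples. The function names mirror the Keras strings, but these are simplified illustrations written for this example, not the Keras implementations; the eps clipping value is an assumption.

```python
import math

def mean_squared_error(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    total = 0.0
    for t, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # keep log() finite
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(y_true)

def categorical_crossentropy(one_hot_labels, probs, eps=1e-7):
    # Targets are one-hot vectors, e.g. [0, 0, 1]
    total = 0.0
    for target_row, prob_row in zip(one_hot_labels, probs):
        total += -sum(t * math.log(max(p, eps)) for t, p in zip(target_row, prob_row))
    return total / len(one_hot_labels)

def sparse_categorical_crossentropy(int_labels, probs, eps=1e-7):
    # Targets are integer class indices, e.g. 2
    total = 0.0
    for label, prob_row in zip(int_labels, probs):
        total += -math.log(max(prob_row[label], eps))
    return total / len(int_labels)

probs = [[0.7, 0.2, 0.1], [0.1, 0.1, 0.8]]
print(categorical_crossentropy([[1, 0, 0], [0, 0, 1]], probs))
print(sparse_categorical_crossentropy([0, 2], probs))
```

The last two calls describe the same targets in two encodings, which is the only difference between the categorical and sparse categorical variants.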


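**metrics**:
The metrics parameter in compile() specifies the evaluation metrics to be computed during training. Metrics are used to assess the model's performance and can include accuracy, precision, recall, F1-score, and more. Some common metrics for classification tasks include:

*   '**accuracy**': Computes the fraction of predictions that match the true labels.
*   '**precision**': Calculates the fraction of predicted positives that are actually positive.
*   '**recall**': Calculates the fraction of actual positives that the model finds.
*   '**f1_score**': Computes the F1 score, the harmonic mean of precision and recall.

Pass the metrics as a list. 'accuracy' is accepted as a string by every Keras version; for the other metrics, string-name support varies between versions, so passing a metric object such as tf.keras.metrics.Precision() is the more portable choice. For example:

In [None]:
# Precision is passed as a metric object; string-name support varies by Keras version
model.compile(metrics=['accuracy', tf.keras.metrics.Precision()])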
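
For reference, the four metrics named above can be computed by hand for a binary problem from the counts of true and false positives and negatives. The helper name classification_metrics is an assumption for this sketch, and returning 0.0 when a denominator is zero is a convention chosen here.

```python
def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, and F1 for binary labels (0 or 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

scores = classification_metrics(
    y_true=[1, 1, 1, 0, 0, 0, 0, 0],
    y_pred=[1, 1, 0, 1, 0, 0, 0, 0],
)
print(scores)
```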

**The fit() method is used to train a model in TensorFlow. It takes several parameters that define the training process. Let's explore the parameters in more detail:**

In [None]:
model.fit(
    x=None,
    y=None,
    batch_size=None,
    epochs=1,
    verbose=1,
    callbacks=None,
    validation_split=0.0,
    validation_data=None,
    shuffle=True,
    class_weight=None,
    sample_weight=None,
    initial_epoch=0,
    steps_per_epoch=None,
    validation_steps=None,
    validation_batch_size=None,
    validation_freq=1,
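    max_queue_size=10,          # Keras 2 only; not accepted by fit() in Keras 3
    workers=1,                  # Keras 2 only; not accepted by fit() in Keras 3
    use_multiprocessing=False,  # Keras 2 only; not accepted by fit() in Keras 3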
)


**x**: The x parameter represents the input data for training the model. It can be a NumPy array or a TensorFlow Dataset object containing the training samples.

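**y**: The y parameter represents the target values or labels corresponding to the input data. It must contain the same number of samples as x. When x is a tf.data.Dataset or a generator, the targets come from x itself and y should be left unset.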

**batch_size**: The batch_size parameter determines the number of samples per gradient update, that is, how many samples are processed before the model's weights are updated. Training in batches helps with memory efficiency and computational speed. If left as None, Keras uses a batch size of 32. Do not set batch_size when x is a tf.data.Dataset or a generator, since those already produce batches.

**epochs**: The epochs parameter specifies the number of times the model will iterate over the entire training dataset. Within each epoch the data is processed batch by batch, and every batch triggers one forward pass and one backward pass (gradient update).
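
The relationship between dataset size, batch_size, and epochs is simple arithmetic, sketched below. The helper name count_gradient_updates is an assumption; the default of 32 matches the batch size Keras uses when batch_size is None.

```python
import math

def count_gradient_updates(num_samples, batch_size=32, epochs=1):
    """Return (steps per epoch, total gradient updates) for array-like training data."""
    # The last batch of an epoch may be smaller, hence the ceiling
    steps_per_epoch = math.ceil(num_samples / batch_size)
    return steps_per_epoch, steps_per_epoch * epochs

steps, total_updates = count_gradient_updates(num_samples=1000, batch_size=32, epochs=5)
print(steps, total_updates)
```

With 1000 samples and a batch size of 32, each epoch has 31 full batches plus one batch of 8 samples.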

**verbose**: The verbose parameter controls how much information is displayed during training. Setting verbose=0 silences the output, verbose=1 displays a progress bar with training metrics, and verbose=2 prints one line per epoch, which is better suited to log files.

**callbacks**: The callbacks parameter allows you to specify a list of callbacks to be applied during training. Callbacks are objects that can perform various actions at different stages of training, such as saving model checkpoints, early stopping, or custom logging.
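
As a sketch of how callbacks plug into training, the example below trains a small binary classifier on synthetic data and stops early once the validation loss stops improving. The data, the model, and the names x_demo, y_demo, and demo_model are assumptions made up for this example.

```python
import numpy as np
import tensorflow as tf

# Synthetic data: the label is 1 when the features sum to a positive number
rng = np.random.default_rng(0)
x_demo = rng.normal(size=(200, 4)).astype("float32")
y_demo = (x_demo.sum(axis=1) > 0).astype("float32").reshape(-1, 1)

demo_model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
demo_model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Stop when val_loss has not improved for 3 epochs and keep the best weights
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=3, restore_best_weights=True
)

history = demo_model.fit(
    x_demo, y_demo,
    batch_size=32,
    epochs=20,
    validation_split=0.2,
    callbacks=[early_stop],
    verbose=0,
)
print("epochs run:", len(history.history["loss"]))
```

The History object returned by fit() records the loss and metric values per epoch, so you can check afterwards how long training actually ran.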

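**validation_split**: The validation_split parameter specifies the fraction of the training data to be used for validation. For example, setting validation_split=0.2 will use 20% of the training data for validation, while the remaining 80% is used for training. Note that the validation samples are taken from the end of x and y before any shuffling, so shuffle the data yourself beforehand if it is ordered, for example by class. validation_split is only supported for array-like inputs, not for tf.data.Dataset objects or generators.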
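
The split itself amounts to slicing off the tail of the arrays. The sketch below imitates that behaviour with plain Python lists; the helper name split_for_validation is an assumption, and the code illustrates the documented behaviour rather than reproducing Keras internals.

```python
def split_for_validation(x, y, validation_split):
    """Hold out the last fraction of the samples, without shuffling."""
    split_at = int(len(x) * (1.0 - validation_split))
    train = (x[:split_at], y[:split_at])
    validation = (x[split_at:], y[split_at:])
    return train, validation

samples = list(range(10))
labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]  # ordered by class on purpose
(train_x, train_y), (val_x, val_y) = split_for_validation(samples, labels, 0.2)
print(train_y, val_y)
```

Because the labels here are sorted, the training part contains no sample of class 1 at all, which is why ordered data should be shuffled before calling fit() with validation_split.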

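**validation_data**: The validation_data parameter allows you to provide explicit validation data for evaluating the model's performance during training. It is typically a tuple of the form (x_val, y_val) representing the validation inputs and targets; a tf.data.Dataset is also accepted. If both validation_data and validation_split are given, validation_data takes precedence.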

**shuffle**: The shuffle parameter determines whether the training data should be shuffled before each epoch. Shuffling changes the composition of the batches from epoch to epoch, which keeps the model from picking up on the order of the samples. By default, it is set to True. It has no effect when x is a tf.data.Dataset or a generator.

**class_weight**: The class_weight parameter is a dictionary that maps class indices to weights, for example {0: 1.0, 1: 5.0}, and scales each sample's contribution to the loss according to its class. It is useful under class imbalance, when certain classes have fewer samples and you want to give them more importance during training.
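
One common way to choose the weights is the "balanced" heuristic, n_samples / (n_classes * class_count), which is the formula scikit-learn uses for class_weight="balanced". The sketch below computes it in plain Python; the helper name balanced_class_weights is an assumption, and other weighting schemes are equally valid.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Weight each class inversely to its frequency."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}

# 90 samples of class 0 and 10 samples of class 1
imbalanced_labels = [0] * 90 + [1] * 10
weights = balanced_class_weights(imbalanced_labels)
print(weights)
```

The resulting dictionary can be passed directly as class_weight=weights to fit(), so that the rare class contributes more to the loss per sample.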

These are some of the key parameters of the fit() method. There are additional parameters for more advanced use cases, such as distributed training, data parallelism, and handling large datasets. You can refer to the TensorFlow documentation for more information on these parameters and their usage.