# Dependencies

- OS `os`
- MLflow `mlflow`
- PyTorch and torchvision `torch`, `torchvision`
- Optuna `optuna`

---

# Project Definition

In the **Digikala** sellers panel, sellers who list a product for sale must also
submit a number of images of that product. Not every image is acceptable, because
images must meet a set of predetermined conditions.

One of these conditions is the **absence of any watermarks** on the
image. The goal of this project is to use the provided data to train a model
that is able to detect the presence of watermarks.

The data set has **training** and **test** parts. In the training part,
images that have a watermark are in a `positive` folder and images that
do not have any watermark are in a `negative` folder.

Using these images, we train our machine learning algorithm
and then predict whether each of the images in the test folder has a watermark.
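
The folder layout described above can be turned into labeled samples with a few lines of standard-library Python. This is only a minimal sketch: the function name `collect_labeled_images`, the accepted file suffixes, and the mapping of `positive` to label `1` and `negative` to label `0` are assumptions made for illustration, not details taken from the project code.

```python
from pathlib import Path

# Assumed image formats; adjust to whatever the data set actually contains.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def collect_labeled_images(train_dir):
    """Return (path, label) pairs: 1 = watermark (positive), 0 = no watermark (negative)."""
    samples = []
    for folder_name, label in (("negative", 0), ("positive", 1)):
        folder = Path(train_dir) / folder_name
        # Sort for a deterministic order across runs.
        for path in sorted(folder.iterdir()):
            if path.suffix.lower() in IMAGE_SUFFIXES:
                samples.append((path, label))
    return samples
```

Calling `collect_labeled_images` on the training directory returns the negative samples first, followed by the positive ones; files with other suffixes are skipped.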

---

# Learning Model

In this project, we use a pre-trained **Inception V3** deep neural network as an **End-To-End** model and adapt it to this task with the **Transfer Learning** technique.

Transfer learning brings a range of benefits to the development process of machine learning models. The main benefits of transfer learning include the saving of resources and improved efficiency when training new models.

> [End-to-End Models](https://www.capitalone.com/tech/machine-learning/pros-and-cons-of-end-to-end-models/)

> [Transfer Learning](https://en.wikipedia.org/wiki/Transfer_learning#:~:text=Transfer%20learning%20(TL)%20is%20a,when%20trying%20to%20recognize%20trucks.)

End-to-end models have a number of advantages relative to component-based systems, but they also have some disadvantages.

Advantages of end-to-end models:
- **Better metrics**: Currently, the systems with the best performance according to metrics such as precision and recall tend to be end-to-end models.
- **Simplicity**: End-to-end models avoid the sometimes thorny problem of determining which components are needed to perform a task and how those components interact. In component-based systems, if the output format of one component is changed, the input format of other components may need to be revised.
- **Reduced effort**: End-to-end models arguably require less work to create than component-based systems. Component-based systems require a larger number of design choices.
- **Applicability to new tasks**: End-to-end models can potentially work for a new task simply by retraining using new data. Component-based systems may require significant re-engineering for new tasks.
- **Ability to leverage naturally-occurring data**: End-to-end models can be trained on existing data, such as translations of works from one language to another, or logs of customer service agent chats and actions. Component-based systems may require creation of new labeled data to train each component.
- **Optimization**: End-to-end models are optimized for the entire task. Optimization of a component-based system is difficult. Errors accumulate across components, with a mistake in one component affecting downstream components. Information from downstream components can’t inform upstream components.
- **Lower degree of dependency on subject matter experts**: End-to-end models can be trained on naturally-occurring data, which reduces the need for specialized linguistic and domain knowledge. But expertise in deep neural networks is often required.
- **Ability to fully leverage machine learning**: End-to-end models take the idea of machine learning to the limit.

---

# Code

The main classes that implement the overall logic of the code are as follows:

1. **`HyperParameterOptimization` class**: This class is
    responsible for hyperparameter tuning. It relies on the
    `ModelInitializer` and `ModelTrainer` classes and uses the
    `optuna` library.

    The main function of this class is `tune`, which defines
    a `study` over the hyperparameter search space. To run a
    `study` in the `optuna` library, an objective function must be
    defined manually (here, `hyper_parameter_optimization_objective_function`).

    ```
    def tune(self):
        # Search for the hyperparameters that maximize the objective value.
        study = optuna.create_study(direction="maximize", sampler=optuna.samplers.TPESampler())

        study.optimize(self.hyper_parameter_optimization_objective_function, n_trials=self.n_trial)

        best_trial = study.best_trial

        params = best_trial.params

        # Print the best hyperparameters as a two-column table.
        print('*' * 160)
        for k, v in params.items():
            print("{:<15}{:<25}".format(k, str(v)))

        print('*' * 160)

        # Rebuild the model, dataloaders and optimizer with the best hyperparameters.
        hyperparameters_configs = HyperParameterConfigs(params)
        hyperparameters_configs.set_metrics()

        self.model_initializer.initialize_with_hyper_parameters(hyperparameters_configs)

        # Train the final model with the best configuration.
        _ = self.model_trainer.fit_model_with_setup(self.model_initializer.train_dataloader,
                                                    self.model_initializer.validation_dataloader,
                                                    self.model_initializer.model,
                                                    self.model_initializer.loss_criterion,
                                                    self.model_initializer.optimizer,
                                                    self.model_initializer.device)

        fine_tuned_model = self.model_initializer.model

        return fine_tuned_model, hyperparameters_configs
    ```



2. **`ModelInitializer` class**: This class implements the following steps:
    1. In the first step, it reads the training-phase data from
    disk and splits it into training and validation subsets
    (pytorch `Subset` class). These `Subset` instances are then
    wrapped as pytorch `Dataset` objects so that transforms can
    be applied to them.

    2. In the second step, it applies
     transforms to the training
    and validation subsets as follows:

        ```
        train_transforms = transforms.Compose([
            transforms.Resize(size=image_data_shape),
            transforms.ToTensor(),
            transforms.RandomHorizontalFlip(p=probability),
            transforms.RandomPerspective(p=probability, distortion_scale=.5),
            transforms.Normalize(mean, standard_deviation)
        ])

        validation_transforms = transforms.Compose([
            transforms.Resize(size=image_data_shape),
            transforms.ToTensor(),
            transforms.Normalize(mean, standard_deviation)
        ])

        train_dataset = DatasetFromSubset(self.train_subset, transform=train_transforms)
        validation_dataset = DatasetFromSubset(self.validation_subset, transform=validation_transforms)
        ```
    3. In the third step, it uses the datasets created in the previous
    step to build the training and validation dataloaders.
    (During model training, the pytorch `DataLoader` class splits the
    training and validation datasets into mini-batches of size `batch_size`.)

        ```
        # batch_size comes from the hyperparameter configs object passed as input
        batch_size = hp_configs.batch_size
        shuffle = True
        drop_last = False
        num_workers = 4

        self.train_dataloader = data.DataLoader(train_dataset,
                                                batch_size=batch_size,
                                                shuffle=shuffle,
                                                drop_last=drop_last,
                                                num_workers=num_workers,
                                                pin_memory=True)

        self.validation_dataloader = data.DataLoader(validation_dataset,
                                                     batch_size=batch_size,
                                                     shuffle=shuffle,
                                                     drop_last=drop_last,
                                                     num_workers=num_workers,
                                                     pin_memory=True)
        ```
    4. In the fourth step, the following
       items are created using the hyperparameter
       values given as input (`HyperParameterConfigs` class):

        * Device `torch.device`:
        ```
        self.device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
        ```
        * Model `torchvision.models.Inception3`
        ```
        model: models.Inception3 = models.inception_v3(pretrained=True, progress=True)

        last_fc_layer_input_number = model.fc.in_features
        model.fc = nn.Linear(last_fc_layer_input_number, number_of_classes)

        model = model.to(self.device)

        self.model = model
        ```
        * Optimizer `torch.optim.Adam`
        ```
        self.optimizer = optim.Adam(params=model.parameters(), lr=lr, weight_decay=weight_decay)
        ```
        * Loss Criterion `torch.nn.CrossEntropyLoss`
        ```
        self.loss_criterion = nn.CrossEntropyLoss()
        ```
        * Learning Rate Scheduler `torch.optim.lr_scheduler.ExponentialLR`
        ```
        self.lr_scheduler = optim.lr_scheduler.ExponentialLR(self.optimizer, gamma=scheduler_gamma)
        ```

   All of these items are then passed to the `ModelTrainer` class object as input.

3. **`ModelTrainer` class**: Using the items created
    in the previous section, this class runs the training
    process of the neural network model.

    The main function of this class is `fit_model_with_setup`;
    its main part is as follows:

   ```
   for epoch_i in range(self.epochs):
       train_true_labels, train_predicted_scores = self.train_loop()
       test_true_labels, test_predicts_proba = self.test_loop()

       train_auc, train_accuracy, train_f_score, train_loss = self.calculate_metrics(train_true_labels, train_predicted_scores)
       validation_auc, validation_accuracy, validation_f_score, validation_loss = self.calculate_metrics(test_true_labels, test_predicts_proba)

       current_epoch_metrics_dictionary = {
           TRAIN_AUC_STR: train_auc,
           TRAIN_ACCURACY_STR: train_accuracy,
           TRAIN_F1SCORE_STR: train_f_score,
           TRAIN_LOSS_STR: train_loss,

           VALIDATION_AUC_STR: validation_auc,
           VALIDATION_ACCURACY_STR: validation_accuracy,
           VALIDATION_F1SCORE_STR: validation_f_score,
           VALIDATION_LOSS_STR: validation_loss,
        }

       metrics_tracker.insert(epoch_i, current_epoch_metrics_dictionary)

       mlflow.log_metrics(current_epoch_metrics_dictionary)

       print("{:<15}{:<25}{:<25}{:<25}{:<25}{:<25}{:<25}".format(epoch_i + 1, train_loss, validation_loss, train_f_score, validation_f_score, train_accuracy, validation_accuracy))
       print('-' * 160)
    ```
4. **`ModelEvaluator` class**: Using the model
 obtained in the previous section, this class predicts
 the labels of the test data, which has not been used
 in the training process at all.

---

# Results

The results of the training process are logged with **MLflow** and can be viewed in the browser through the MLflow tracking UI, which is typically started with the `mlflow ui` command.

---

# Predict Test Data Labels

The predictions of the trained model on the test data (which has not been used in the training process) are
stored in the `output.csv` file.
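
Writing such a file can be sketched with the standard `csv` module. The function name `write_predictions` and the column names `name` and `predicted` are hypothetical; the actual layout of `output.csv` is not specified here.

```python
import csv


def write_predictions(predictions, output_path="output.csv"):
    """Write (image_name, predicted_label) rows to a CSV file with a header row."""
    with open(output_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["name", "predicted"])  # hypothetical column names
        writer.writerows(predictions)
```

For example, `write_predictions([("img_1.jpg", 1), ("img_2.jpg", 0)])` would produce a header row followed by one row per test image.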