***
<img src='./saves/manganda.png'></img>
<div class="center" style="padding: 1px; height: 70px; background: black; text-align: center;">
    <h2 style="font-family:Copperplate, Papyrus, fantasy;
               font-size:30px;
               font-style:bold;
               color:white;">
        Regressive Approach in Rating Mangas thru Sample Art</h2>
</div>

***
by : JP Fabrero

***
<div class="center" style="padding: 1px; height: 60px; background: black;">
    <h2 style="font-family:Copperplate, Papyrus, fantasy;
               font-size:30px;
               margin: 15px;
               font-style:bold;
               color:white;">
        Introduction
    </h2>
</div>

***
Are you tired of falling in love with a new manga, only to learn that the series will soon be dropped or discontinued? As a fellow manga enthusiast, I understand the heartbreak and disappointment of investing time and energy into a manga that ultimately falls short. It can also be exhausting to choose your next adventure when countless series keep pouring into the industry, waiting to be explored. But what if you could navigate this sea of manga with ease, armed with insights into which series are likely to earn support and last longer?

I've pondered how I might better evaluate manga without spending hours on research and reading reviews, and decided to tackle the problem with Deep Learning. In this blog, I'm setting aside all the literary elements of a manga and looking closely at the key visual elements that make a manga shine. Using only samples of a manga's panels or pages, I'm building a model that learns the relevant visual features, such as art style and character design. By analyzing these factors, my model, ManGanda, attempts to provide a sneak peek of a manga's potential rating.

By undertaking this project, I seek to help both publishers and readers make better-informed decisions about which manga series to invest in and promote.

***
<div class="center" style="padding: 1px; height: 60px; background: black;">
    <h2 style="font-family:Copperplate, Papyrus, fantasy;
               font-size:30px;
               margin: 15px;
               font-style:bold;
               color:white;">
        Importing Libraries and Utility Functions
    </h2>
</div>

***
For this project, I wrote a number of scripts with utility functions for the various necessary steps. I've imported them here, along with some other libraries.

See <a href='https://github.com/jpK1ba/ManGanda/blob/master/utils/utility.py'>code here</a>.

In [1]:
# Necessary customizations for my machine (optional)
import os
os.environ['SKIMAGE_DATADIR'] = '/tmp/.skimage_cache'
os.environ['XDG_CACHE_HOME'] = '/home/msds2023/jfabrero/.cache'

# Importing the necessary libraries and functions
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

from utils.pyjanitor import auto_toc
from utils.pickling import load_pkl, save_pkl
from utils.data_collection import get_data, download_mangas, get_annotations
from utils.dataset import MangaDataset
from utils.dataloader import MangaDataloader
from utils.manganda import MangaModel, train_model
from utils.mangagradcam import MangaGradCAM
from utils.utility import (plot_ratings,
                           get_baseline,
                           eval_model,
                           plot_predictions,
                           plot_test,
                           plot_saliencies)

toc = auto_toc(row_align='left')

# For cleanliness
import warnings
warnings.filterwarnings('ignore')

***
<div class="center" style="padding: 1px; height: 60px; background: black;">
    <h2 style="font-family:Copperplate, Papyrus, fantasy;
               font-size:30px;
               margin: 15px;
               font-style:bold;
               color:white;">
        Data Collection
    </h2>
</div>

***
At the root of all great results are robust models and relevant data. With that in mind, I started this endeavor by gathering a list of top-rated manga from MyAnimeList, https://myanimelist.net/topmanga.php?type=manga, along with the target variable, the manga rating. I made the `get_data` function to take care of that.

Next came collecting the manga samples. I made a web crawler to scour a manga scan site I know of, look for each manga in the list, and download the panels if available. For fidelity, the sampling was done by picking a set of random chapters and taking a defined number of panels from the middle of each chapter (this avoids picking up cover pages). The function `download_mangas` does just that.
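A minimal sketch of that sampling rule, assuming each manga is available as a mapping from chapter number to an ordered list of page files. The name `sample_panels` and its parameters are hypothetical and are not taken from `download_mangas`.

```python
import random


def sample_panels(chapters, n_chapters=3, n_panels=2, seed=None):
    """Pick random chapters, then take `n_panels` pages from the middle of each.

    `chapters` maps a chapter id to its ordered list of page paths (assumed layout).
    """
    rng = random.Random(seed)
    chosen = rng.sample(sorted(chapters), k=min(n_chapters, len(chapters)))
    sampled = []
    for chapter in chosen:
        pages = chapters[chapter]
        # Centre a window of `n_panels` pages on the middle of the chapter,
        # which skips covers at the start and extras at the end.
        start = max((len(pages) - n_panels) // 2, 0)
        sampled.extend((chapter, page) for page in pages[start:start + n_panels])
    return sampled


# Toy input: 10 chapters with 20 pages each.
chapters = {ch: [f"ch{ch}_p{p}.jpg" for p in range(1, 21)] for ch in range(1, 11)}
picked = sample_panels(chapters, n_chapters=3, n_panels=2, seed=0)
```

With 20 pages per chapter and two panels requested, every sampled page is page 10 or 11, whichever chapters are drawn.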

Finally, to conclude the data collection, I made a function that prepares the annotations for all the successfully sampled manga ahead of model training. Unfortunately, during my initial training attempts, I ran into many errors and discrepancies, such as duplicates and corrupted images. So I added checks to `get_annotations` to handle those and ensure that the images used are fit for training.

See <a href='https://github.com/jpK1ba/ManGanda/blob/master/utils/data_collection.py'>code here</a>.
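The kind of check described above can be sketched with the standard library alone. This is an illustration under assumptions, not the logic inside `get_annotations`: it treats byte-identical files as duplicates and uses the JPEG start and end markers as a cheap test for truncated downloads. Fully decoding each file with an image library would be a stricter check. `filter_images` is a hypothetical name.

```python
import hashlib

JPEG_START, JPEG_END = b"\xff\xd8", b"\xff\xd9"


def filter_images(files):
    """Keep the first copy of each intact JPEG; `files` yields (name, raw_bytes)."""
    seen, kept = set(), []
    for name, data in files:
        # A truncated download usually loses the end-of-image marker.
        if not (data.startswith(JPEG_START) and data.endswith(JPEG_END)):
            continue
        digest = hashlib.sha256(data).hexdigest()
        if digest in seen:  # byte-identical duplicate of an earlier file
            continue
        seen.add(digest)
        kept.append(name)
    return kept


# Fake payloads standing in for downloaded files.
files = [
    ("a.jpg", b"\xff\xd8AAA\xff\xd9"),
    ("a_copy.jpg", b"\xff\xd8AAA\xff\xd9"),  # duplicate of a.jpg
    ("cut.jpg", b"\xff\xd8AAA"),             # truncated, no end marker
    ("b.jpg", b"\xff\xd8BBB\xff\xd9"),
]
kept = filter_images(files)
```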

In [2]:
df_top = get_data(5_000) # Top 5_000
df_dl = download_mangas(df_top) # Dataframe of sampled manga details
get_annotations(df_dl) # Write annotations based on the sampled mangas

Samples are already downloaded!
Download Rate: 25.65%


In [3]:
plt.imshow(plt.imread('saves/tsukki.png')) # Displaying a sample panel
plt.axis('off')
toc.add_fig('Sample Panel - Haikyuu!! by Furudate Haruichi', width=100)

***
<div class="header" style="
  padding: 20px;
  background: black;">
    <h2 style="font-family:Copperplate, Papyrus, fantasy;
               font-size:30px;
               font-style:bold;
               color:white;">
        Dataset and Dataloader
    </h2>
</div>

***
Next, I utilized the Torch library to create custom datasets and build the dataloader.

In the dataset, `MangaDataset`, every data point or image is preprocessed by using the transformation functions to achieve the following: get the maximum square crop from the panel sample, grayscale the image (for uniformity, manhwas were also somehow part of the top rated manga list), apply horizontal flipping randomly, and resize each cropped panel to be compatible with the model I'm going to use later on. The entire dataset was split into training, validation, and testing sets.
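To make the four preprocessing steps concrete, here is a NumPy-only sketch. It is not the transform pipeline used by `MangaDataset`. The function name `preprocess`, the centred crop, the luma weights, and the nearest-neighbour resize are all assumptions made for illustration. The 224-pixel target is also an assumption, chosen because it matches the usual ResNet input size.

```python
import numpy as np


def preprocess(image, size=224, rng=None):
    """image: H x W x 3 uint8 array -> size x size float32 array in [0, 1]."""
    if rng is None:
        rng = np.random.default_rng()
    h, w = image.shape[:2]
    side = min(h, w)
    top, left = (h - side) // 2, (w - side) // 2
    crop = image[top:top + side, left:left + side]   # largest centred square
    gray = crop @ np.array([0.299, 0.587, 0.114])     # luma-weighted grayscale
    if rng.random() < 0.5:                             # random horizontal flip
        gray = gray[:, ::-1]
    idx = np.arange(size) * side // size               # nearest-neighbour resize
    return (gray[np.ix_(idx, idx)] / 255.0).astype(np.float32)


# Random pixels standing in for a 300 x 200 panel.
panel = np.random.default_rng(0).integers(0, 256, size=(300, 200, 3), dtype=np.uint8)
out = preprocess(panel, rng=np.random.default_rng(1))
```

In a PyTorch pipeline the same steps would normally be expressed as image transforms applied inside the dataset's `__getitem__`, with the random flip applied only to training samples.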

In this section, I also display **Table 1. Preview of Data Annotations** to give you an idea of what the dataloader uses, and plot the histogram of the target variable, `rating`. You can observe below in **Figure 2. Distribution of Ratings** that, since the list was limited to the top 5,000 rated series, the distribution of `rating` seemingly follows the right tail of a bell curve.

Finally, I also set a baseline for what counts as acceptable model performance. I used the variance of the ratings in the dataset as the baseline. Since the variance equals the MSE of a model that always predicts the mean rating, this compares the model's prediction error against the inherent variability in the dataset.
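The link between variance and MSE can be checked on a handful of numbers. The sketch below uses the ten ratings from the annotation preview in this notebook and the population variance. Whether `get_baseline` uses the population or the sample variance is not shown here, so treat that choice as an assumption.

```python
from statistics import fmean, pvariance

# Ratings taken from the ten-row annotation preview.
ratings = [7.38, 8.25, 7.71, 8.42, 8.37, 7.38, 8.40, 7.70, 7.39, 7.51]

mean_rating = fmean(ratings)
# MSE of a "model" that predicts the mean rating for every manga.
mse_of_mean = fmean([(r - mean_rating) ** 2 for r in ratings])
# Population variance of the same ratings.
baseline = pvariance(ratings)
```

The two quantities agree, so a trained model only adds value if its MSE falls below this number.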

See <a href='https://github.com/jpK1ba/ManGanda/blob/master/utils/dataset.py'>code here</a>.

In [4]:
# Importing the built Dataset
dataset = MangaDataset()

In [5]:
# Display a preview of the annotations built
toc.add_table(dataset.annotations.sample(10)[['title', 'rating', 'paths']],
              'Data Annotations')

title,rating,paths
Umi no Misaki,7.38,./data/umi_no_misaki/umi_no_misaki_119_10.jpg
Feng Shen Ji II,8.25,./data/feng_shen_ji_ii/feng_shen_ji_ii_23_17.jpg
Arifureta Shokugyou de Sekai Saikyou,7.71,./data/arifureta_shokugyou_de_sekai_saikyou/arifureta_shokugyou_de_sekai_saikyou_23_12.jpg
The Boxer,8.42,./data/the_boxer/the_boxer_76_67.jpg
Wotaku ni Koi wa Muzukashii,8.37,./data/wotaku_ni_koi_wa_muzukashii/wotaku_ni_koi_wa_muzukashii_15_5.jpg
Madou no Keifu,7.38,./data/madou_no_keifu/madou_no_keifu_9_19.jpg
Noragami,8.4,./data/noragami/noragami_23_24.jpg
Eureka Seven,7.7,./data/eureka_seven/eureka_seven_19_20.jpg
Yoroshiku Master,7.39,./data/yoroshiku_master/yoroshiku_master_11_20.jpg
Fantasy Bishoujo Juniku Ojisan to,7.51,./data/fantasy_bishoujo_juniku_ojisan_to/fantasy_bishoujo_juniku_ojisan_to_42_4.jpg


In [6]:
plot_ratings(dataset, toc)

In [7]:
baseline = get_baseline(dataset)

For the `MangaDataloader`, I noticed that there were outliers in the dataset: valid images that are made up mostly of blank space or white pixels. To mitigate this, I defined my own `collate_fn` that checks whether a loaded image's characteristics lie within the distribution of my training dataset and rejects the image if it is found to be out of distribution.

See <a href='https://github.com/jpK1ba/ManGanda/blob/master/utils/dataloader.py'>code here</a>.
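One way such a filter could look is sketched below. It is not the `collate_fn` from the linked file. It assumes that the "characteristic" is the mean pixel intensity and that an image is out of distribution when it lies more than `k` standard deviations from the training mean. Both choices, and the name `make_collate_fn`, are hypothetical. NumPy arrays are used so the sketch runs without PyTorch.

```python
import numpy as np


def make_collate_fn(mean, std, k=3.0):
    """Build a collate function that drops samples whose brightness is an outlier."""
    low, high = mean - k * std, mean + k * std

    def collate_fn(batch):
        # Keep only (image, rating) pairs whose mean intensity is in range.
        kept = [(img, y) for img, y in batch if low <= img.mean() <= high]
        if not kept:
            return None  # the training loop has to skip an empty batch
        images, ratings = zip(*kept)
        return np.stack(images), np.asarray(ratings, dtype=np.float32)

    return collate_fn


collate_fn = make_collate_fn(mean=0.5, std=0.1)   # made-up training statistics
batch = [
    (np.full((1, 8, 8), 0.5), 8.1),   # ordinary panel
    (np.ones((1, 8, 8)), 7.4),        # all-white panel, should be rejected
]
images, ratings = collate_fn(batch)
```

With PyTorch, a function of this shape would be passed to the loader through its `collate_fn` argument and would return stacked tensors instead of arrays.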

In [8]:
# Importing the built Dataloader
dataloader = MangaDataloader(dataset, batch_size=24)

***
<div class="header" style="
  padding: 20px;
  background: black;">
    <h2 style="font-family:Copperplate, Papyrus, fantasy;
               font-size:30px;
               font-style:bold;
               color:white;">
        ManGanda Model
    </h2>
</div>

***
To maximize my model's performance, I used a method called `Transfer Learning`, which leverages the learned representations or parameters of a pretrained, typically State-of-the-Art, model as a starting point, improving the performance and efficiency of the new model. For `Transfer Learning`, choosing a suitable pretrained model, ideally one trained on a similar domain, is crucial for its knowledge to be transferable. With that in mind, I employed a ResNet18 model, developed by Matthew Baas, that was pretrained on the Danbooru2018 dataset. Danbooru2018 consists of millions of annotated images collected from the Danbooru community, a popular imageboard site for anime and manga enthusiasts. If you want to know more about the pretrained model, you can visit Matthew Baas's repository: https://github.com/RF5/danbooru-pretrained. [1]

Given the domain similarity and the relatively small dataset, `Feature Extraction` is the best-suited approach. [2] `Feature Extraction` is a type of `Transfer Learning` that uses the pretrained model to extract relevant features from the target data and then feeds these features to a separate model for a specific task, in this case regression.
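In PyTorch terms, feature extraction usually comes down to freezing the backbone and optimizing only a small head. The sketch below shows the pattern with a tiny stand-in network and random tensors, so nothing is downloaded. The layer sizes, learning rate, and dummy data are assumptions for illustration and do not reflect the actual `MangaModel`.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in for the pretrained network.
backbone = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
)
for param in backbone.parameters():
    param.requires_grad_(False)   # freeze: these weights are never updated
backbone.eval()                   # keep layers such as BatchNorm in inference mode

head = nn.Linear(8, 1)            # the only trainable part (regression output)
optimizer = torch.optim.Adam(head.parameters(), lr=1e-2)
loss_fn = nn.MSELoss()

images = torch.rand(16, 1, 32, 32)    # dummy grayscale panels
ratings = torch.rand(16, 1) + 7.5     # dummy targets

frozen_before = [p.clone() for p in backbone.parameters()]
for _ in range(5):
    with torch.no_grad():             # no gradients needed through the backbone
        features = backbone(images)
    loss = loss_fn(head(features), ratings)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

Because only `head.parameters()` is handed to the optimizer, the backbone weights are identical before and after training.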

`MangaModel` uses the pretrained ResNet18 model and initializes it with my saved, already-trained regression network parameters. For the regression network, I added a threshold to discourage the model from sacrificing performance on certain data points in exchange for a lower overall loss, which would result in poor generalizability.

See <a href='https://github.com/jpK1ba/ManGanda/blob/master/utils/manganda.py'>code here</a>.

In [9]:
ManGanda = MangaModel(dataset, dataloader)

Using cache found in /home/msds2023/jfabrero/.cache/torch/hub/RF5_danbooru-pretrained_master


Layer (type:depth-idx)                   Output Shape              Param #
├─Sequential: 1-1                        [-1, 512, 7, 7]           --
|    └─Conv2d: 2-1                       [-1, 64, 112, 112]        (9,408)
|    └─BatchNorm2d: 2-2                  [-1, 64, 112, 112]        (128)
|    └─ReLU: 2-3                         [-1, 64, 112, 112]        --
|    └─MaxPool2d: 2-4                    [-1, 64, 56, 56]          --
|    └─Sequential: 2-5                   [-1, 64, 56, 56]          --
|    |    └─BasicBlock: 3-1              [-1, 64, 56, 56]          (73,984)
|    |    └─BasicBlock: 3-2              [-1, 64, 56, 56]          (73,984)
|    └─Sequential: 2-6                   [-1, 128, 28, 28]         --
|    |    └─BasicBlock: 3-3              [-1, 128, 28, 28]         (230,144)
|    |    └─BasicBlock: 3-4              [-1, 128, 28, 28]         (295,424)
|    └─Sequential: 2-7                   [-1, 256, 14, 14]         --
|    |    └─BasicBlock: 3-5              [-1, 256, 

***
<div class="header" style="
  padding: 20px;
  background: black;">
    <h2 style="font-family:Copperplate, Papyrus, fantasy;
               font-size:30px;
               font-style:bold;
               color:white;">
        Evaluation
    </h2>
</div>

***
Now that the modelling is done, I'm going to evaluate my ManGanda model's performance and compare it against the previously defined baseline. The imported utility function `eval_model` does that for us.

Lastly, to visualize what the model "sees" and display how it performs for different manga series, I used `plot_predictions` to randomly select manga, draw random samples, feed them into the model, and compare the predictions with the ground truth.

In [10]:
eval_model(ManGanda, dataloader, baseline)

In [11]:
plot_predictions(ManGanda, dataset, toc)

ManGanda is the result of combining these different techniques for handling images. Evaluated on the test set, the model achieved a Test MSE of $0.16$ and a Test MAE of $0.31$. With this, ManGanda performs well enough to predict manga ratings, being off by only about $0.31$ on average.

At first glance, this is already significant. However, I noticed that the model's predictions are seemingly bounded within a small range of values. To illustrate this, I made the function `plot_test` to plot the actual and predicted test ratings; see **Figure 6. Distribution of Test Ratings (True vs Predicted)**.
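The same observation can be put into numbers by reporting the spread of the predictions next to the error metrics. The helper below is a sketch. `summarize` is a hypothetical name, and the two lists are made-up values chosen only to show predictions squeezed toward the middle of the range. They are not ManGanda's outputs.

```python
from statistics import fmean, pstdev


def summarize(y_true, y_pred):
    """Return MSE, MAE and the ratio of predicted spread to true spread."""
    errors = [p - t for t, p in zip(y_true, y_pred)]
    return {
        "mse": fmean([e * e for e in errors]),
        "mae": fmean([abs(e) for e in errors]),
        # Well below 1 means the predictions vary much less than the ratings.
        "spread_ratio": pstdev(y_pred) / pstdev(y_true),
    }


y_true = [7.2, 7.5, 7.9, 8.4, 8.8]   # made-up ratings
y_pred = [7.8, 7.9, 7.9, 8.0, 8.1]   # made-up predictions clustered near the mean
report = summarize(y_true, y_pred)
```

A low MAE together with a spread ratio far below 1 is the signature of a model that stays close to the average rating.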

In [12]:
plot_test(ManGanda, dataloader, toc)

***
<div class="header" style="
  padding: 20px;
  background: black;">
    <h2 style="font-family:Copperplate, Papyrus, fantasy;
               font-size:30px;
               font-style:bold;
               color:white;">
        Explainability
    </h2>
</div>

***

My intuition is that the model learned the mean rating of the manga as the bias of its last linear layer, and that the small deviations from it are driven by the visual elements it finds in the sample panels it is fed.

As we've already seen what the model "sees", I'm now going to visualize what the model "looks at" using a technique called GradCAM. [3] GradCAM, short for Gradient-weighted Class Activation Mapping, provides a heat map visualization highlighting the regions of the image that contribute the most to the model's prediction. GradCAM is commonly used for classification tasks, but in this use case I'm using it to emphasize the visual areas that influence my model's predicted rating.

With GradCAM, the cool spots (blue) can be interpreted as affecting the predicted rating minimally, while the hot spots (red) are the areas that drive the prediction the most. This is visualized below in **Figure 7. GradCAM Implementation - \<Manga Title>**.

See <a href='https://github.com/jpK1ba/ManGanda/blob/master/utils/mangagradcam.py'>code here</a>.
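For a regression model, the Grad-CAM recipe stays the same except that the gradient is taken from the predicted rating instead of a class score. The sketch below is a generic illustration and not the `MangaGradCAM` class. The split into `backbone` and `head`, the toy layers, and the function name `grad_cam` are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def grad_cam(backbone, head, image):
    """Heat map in [0, 1] with the same height and width as `image` (1 x C x H x W)."""
    # Treat the last feature maps as a leaf tensor so their gradient can be requested.
    feats = backbone(image).detach().requires_grad_(True)      # 1 x K x h x w
    score = head(feats).sum()                                   # scalar predicted rating
    grads, = torch.autograd.grad(score, feats)
    weights = grads.mean(dim=(2, 3), keepdim=True)              # one weight per channel
    cam = F.relu((weights * feats).sum(dim=1, keepdim=True))    # 1 x 1 x h x w
    cam = F.interpolate(cam, size=image.shape[-2:], mode="bilinear", align_corners=False)
    cam = cam - cam.min()
    return (cam / (cam.max() + 1e-8))[0, 0].detach()


torch.manual_seed(0)
# Toy stand-ins for the convolutional trunk and the regression head.
backbone = nn.Sequential(nn.Conv2d(1, 4, kernel_size=3, stride=2, padding=1), nn.ReLU())
head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(4, 1))
heatmap = grad_cam(backbone, head, torch.rand(1, 1, 32, 32))
```

The ReLU keeps only the regions whose activation pushes the predicted rating upward, which is what the hot spots in the overlay represent.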

In [13]:
ManGandaCAM = MangaGradCAM(ManGanda)
plot_saliencies(ManGandaCAM, ManGanda, toc)

***
<div class="header" style="
  padding: 20px;
  background: black;">
    <h2 style="font-family:Copperplate, Papyrus, fantasy;
               font-size:30px;
               font-style:bold;
               color:white;">
        Conclusion
    </h2>
</div>

***

What a blast! I definitely had fun with this project. I started by building my own dataset through web scraping, then built custom Datasets and Dataloaders using PyTorch, and finally proceeded to retraining a pretrained ResNet model to come up with ManGanda. Its best predictive performance so far is off by only about $0.31$ on average. Although the model's performance seems satisfactory, it still has room for improvement. Ultimately, ManGanda also leveraged XAI to break down the key visual elements, in the hope of guiding artists on what works and what doesn't.

All the code is linked to avoid cluttering this notebook.

***
<div class="header" style="
  padding: 20px;
  background: black;">
    <h2 style="font-family:Copperplate, Papyrus, fantasy;
               font-size:30px;
               font-style:bold;
               color:white;">
        References
    </h2>
</div>

***

[1] Baas, M. (2019). Danbooru2018 pretrained resnet models for PyTorch. GitHub. https://github.com/RF5/danbooru-pretrained

[2] Elgendy, M. (2020, November 10). Deep Learning for Vision Systems. Manning Publications Co.

[3] Selvaraju, R. R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., & Batra, D. (2017, October). Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization. 2017 IEEE International Conference on Computer Vision (ICCV), 618–626. https://doi.org/10.1109/ICCV.2017.74