# Image Quality
- Brisque - https://pypi.org/project/brisque/
- image-quality 1.2.7 - https://pypi.org/project/image-quality/
- NIMA for aesthetic quality - https://github.com/yunxiaoshi/Neural-IMage-Assessment

To assess image quality we have to consider two aspects: the technical quality and the aesthetic quality of an image. In our use case no reference image is available, so we need no-reference quality assessment. We are going to look at how fast and how sensitive each image quality evaluator is, and decide which one is the most efficient for our use.

For testing we created a small folder containing one image and several augmented versions of it.
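
A test set like this could be produced with Pillow along the following lines. This is only a sketch: the variant names mirror the file names used below, but the function name `make_variants` and every parameter (blur radius, rotation angle, hue shift, contrast factor) are assumptions, not the values used for the actual test images.

```python
from PIL import Image, ImageEnhance, ImageFilter, ImageOps


def make_variants(img):
    """Return a dict of name -> augmented copy (all parameters are illustrative guesses)."""
    img = img.convert("RGB")
    # Shift the hue channel in HSV space; 64 out of 256 is roughly a quarter turn
    h, s, v = img.convert("HSV").split()
    h = h.point(lambda value: (value + 64) % 256)
    return {
        "clear": img,
        "GaussBlur": img.filter(ImageFilter.GaussianBlur(radius=4)),
        "rotated": img.rotate(90, expand=True),
        "inverted": ImageOps.invert(img),
        "hue_shift": Image.merge("HSV", (h, s, v)).convert("RGB"),
        "contrast": ImageEnhance.Contrast(img).enhance(1.5),
    }


# Demo on a small synthetic image instead of a real photo
demo = Image.new("RGB", (64, 48), (200, 120, 40))
variants = make_variants(demo)
for name, variant in variants.items():
    print(name, variant.size)
    # variant.save(f"{name}.jpg") would write the file to disk
```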

In [32]:
import os
import time
import torchvision.transforms as transforms

# Build the list of test images (JPEG files in the testing folder)
image_path = '/home/lukas/Bakalářka/photo_culling/images/testing'
img_list = []  # list of image file names to process
for path in os.scandir(image_path):
    if path.is_file() and path.name.endswith(".jpg"):
        img_list.append(path.name)

# TECHNICAL QUALITY
First we are going to focus on technical quality. Here we can choose between two approaches: an algorithmic one based on scene statistics, and one that uses a CNN trained on the TID2013 dataset.

For the algorithmic approach we tested two implementations of the same algorithm, BRISQUE (Blind/Referenceless Image Spatial Quality Evaluator). The first implementation is the library created by Rehan Guha (https://pypi.org/project/brisque/).

In [33]:
from brisque import BRISQUE
from skimage import io

obj = BRISQUE(url=False)
results_BRISQUE = []

tic = time.perf_counter()
for img in img_list:
    x = io.imread(os.path.join(image_path,img))
    results_BRISQUE.append(obj.score(x))
toc = time.perf_counter()
print(f"TIME - {toc - tic:0.2f} s")
print(img_list)
print(results_BRISQUE)

TIME - 53.66 s
['clear.jpg', 'GaussBlur.jpg', 'rotated.jpg', 'inverted.jpg', 'hue_shift.jpg', 'contrast.jpg']
[63.439140291231155, 96.60843292281524, 63.230106082455876, 63.30271773240301, 59.23486617151738, 71.03791634023887]


From this first test we can see that rotating and inverting the image has little to no effect on the score. Shifting the hue and increasing the contrast change the score
only slightly. Gaussian blur causes by far the largest change: the score rises from about 63 to about 97, and since a higher BRISQUE score indicates lower quality, this is the biggest
drop in quality. This is expected, as technical quality should be measured only from each pixel's relation to its surroundings.
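
One way to see why inversion and rotation barely move the score: BRISQUE, as described in the original paper, builds its features from mean-subtracted contrast-normalized (MSCN) coefficients, which compare each pixel with a Gaussian-weighted neighbourhood. The sketch below is a minimal NumPy version of that normalization, not the code of either tested library; the 7x7 window, `sigma = 7/6` and the constant `c = 1` are the values commonly quoted for BRISQUE and are assumed here, and the function names are hypothetical.

```python
import numpy as np


def gaussian_kernel(size=7, sigma=7 / 6):
    """1-D Gaussian kernel normalized to sum to 1."""
    ax = np.arange(size) - size // 2
    k = np.exp(-(ax ** 2) / (2 * sigma ** 2))
    return k / k.sum()


def local_mean(img, kernel):
    """Separable Gaussian-weighted local mean with reflected borders."""
    pad = len(kernel) // 2
    padded = np.pad(img, pad, mode="reflect")
    rows = np.apply_along_axis(np.convolve, 1, padded, kernel, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, kernel, mode="valid")


def mscn(img, c=1.0):
    """MSCN coefficients: (pixel - local mean) / (local std + c)."""
    img = img.astype(np.float64)
    k = gaussian_kernel()
    mu = local_mean(img, k)
    var = local_mean(img * img, k) - mu * mu
    sigma = np.sqrt(np.abs(var))
    return (img - mu) / (sigma + c)


# Synthetic grayscale "image" so the example needs no files
rng = np.random.default_rng(0)
gray = rng.integers(0, 256, size=(32, 32)).astype(np.float64)

base = mscn(gray)
inverted = mscn(255.0 - gray)      # photographic negative
rotated = mscn(np.rot90(gray))     # 90 degree rotation

print(np.allclose(inverted, -base, atol=1e-6))
print(np.allclose(rotated, np.rot90(base), atol=1e-6))
```

Inverting the image only flips the sign of the coefficients, and a 90 degree rotation only rearranges them, so statistics of their distribution change very little. Blurring, in contrast, changes the local contrast itself.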

Next we test the second implementation of BRISQUE, from the image-quality library made by Ricardo Ocampo. With this implementation the images have to be opened
with the Pillow imaging library first, which is a slight downside.

In [28]:
import PIL.Image
import imquality.brisque

results_BRISQUE = []

tic = time.perf_counter()
for img in img_list:
    x = PIL.Image.open(os.path.join(image_path,img))
    x = imquality.brisque.score(x)
    results_BRISQUE.append(x)
toc = time.perf_counter()
print(f"TIME - {toc - tic:0.2f} s")
print(img_list)
print(results_BRISQUE)

KeyboardInterrupt: 

In the second test we observe the same behaviour as in the first one. The most significant score change occurs when the image is blurred. The quality scores
of the two implementations are nearly identical.

Both solutions are very easy to use, return a single number representing the technical quality, and have a computing time that grows linearly with the number of images. That said,
the first implementation is around 5 times faster and does not require us to open the images with Pillow beforehand. Based on these factors, the first
implementation is the better of the two.
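
To compare implementations on equal terms, the timing loop used in the cells above can be wrapped in a small helper that also reports the time per image. This is a minimal sketch; `time_scorer` is a hypothetical name, and the lambda in the demo is a stand-in for a real scoring function such as `obj.score`.

```python
import time


def time_scorer(score_fn, items):
    """Score every item and return (scores, total seconds, seconds per item)."""
    scores = []
    tic = time.perf_counter()
    for item in items:
        scores.append(score_fn(item))
    total = time.perf_counter() - tic
    per_item = total / len(items) if items else 0.0
    return scores, total, per_item


# Demo with a dummy scorer; replace the lambda with a real quality function
scores, total, per_item = time_scorer(lambda x: x * 2, [1, 2, 3])
print(f"TIME - {total:0.2f} s total, {per_item:0.4f} s per image")
```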

Now we can move on to the CNN approach to technical quality assessment.
- I have not been able to test this yet, as I am still struggling to build the network with pretrained weights.

# AESTHETIC QUALITY
For aesthetics there are significantly fewer solutions, as the question of aesthetics is very subjective and it is therefore difficult to create an algorithmic solution.
For this reason we are going to focus on a deep learning approach with a CNN trained on the AVA dataset. This dataset consists of images by amateur photographers, and as such it is focused on the aesthetic quality of the images. The network is trained on ratings given by the public, so it is as close to an objective beauty rating as we can currently get. The output of the network is a distribution of ratings that simulates the distribution of ratings people might actually give. From this distribution we can then compute the mean rating and its standard deviation.
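
The step from a rating distribution to a mean and a standard deviation can be sketched as follows. The probabilities in the demo are hypothetical and chosen only so the arithmetic is easy to follow; `rating_stats` is an illustrative helper, and ratings are assumed to run from 1 to 10.

```python
import math


def rating_stats(probs):
    """Mean and standard deviation of a distribution over ratings 1..len(probs)."""
    ratings = range(1, len(probs) + 1)
    mean = sum(r * p for r, p in zip(ratings, probs))
    var = sum(p * (r - mean) ** 2 for r, p in zip(ratings, probs))
    return mean, math.sqrt(var)


# Hypothetical network output: probability of each rating from 1 to 10
probs = [0.0, 0.0, 0.05, 0.10, 0.20, 0.30, 0.20, 0.10, 0.05, 0.0]
mean, std = rating_stats(probs)
print(f"mean = {mean:.2f}, std = {std:.2f}")
```

A narrow distribution (small standard deviation) means the simulated raters agree with each other, while a wide one means the image is likely to divide opinion.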

We used an implementation inspired by https://github.com/yunxiaoshi/Neural-IMage-Assessment.


In [34]:
from PIL import Image
import torchvision.models as models
import torch
import torch.nn as nn

class NIMA(nn.Module):
    """Neural IMage Assessment model by Google"""
    def __init__(self, base_model, num_classes=10):
        super(NIMA, self).__init__()
        self.features = base_model.features
        self.classifier = nn.Sequential(
            nn.Dropout(p=0.75),
            nn.Linear(in_features=25088, out_features=num_classes),
            nn.Softmax(dim=1))

    def forward(self, x):
        out_f = self.features(x)
        out = out_f.view(out_f.size(0), -1)
        out = self.classifier(out)
        return out_f,out

base_model = models.vgg16(weights=models.VGG16_Weights.DEFAULT)
model = NIMA(base_model)
model.load_state_dict(torch.load(os.path.join(os.getcwd(), 'model.pth'), map_location=torch.device('cpu')))
seed = 42
torch.manual_seed(seed)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)
model.eval()
            
test_transform = transforms.Compose([
        transforms.Resize(256),
        transforms.RandomCrop(224),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406],
                             std=[0.229, 0.224, 0.225])
])

res_list = []
mean, std = 0.0, 0.0
tic = time.perf_counter()
for img in img_list:
    im = Image.open(os.path.join(image_path, str(img))).convert('RGB')
    imt = test_transform(im)
    imt = imt.unsqueeze(dim=0)  # add the batch dimension
    imt = imt.to(device)
    with torch.no_grad():
        out_f, out_class = model(imt)
    out_class = out_class.view(10, 1)  # probabilities of ratings 1..10
    # Mean rating: sum of rating * probability
    for j, e in enumerate(out_class, 1):
        mean += j * e
    # Standard deviation of the rating distribution
    for k, e in enumerate(out_class, 1):
        std += e * (k - mean) ** 2
    std = std ** 0.5
    # Truncate both values to two decimal places
    mean = int(mean.item()*100)/100
    std = int(std.item()*100)/100
    res_list.append((mean,std))
    mean, std = 0.0, 0.0  # reset the accumulators for the next image
toc = time.perf_counter()
print(f"TIME - {toc - tic:0.2f} s")
print(img_list)
print(res_list)


TIME - 2.74 s
['clear.jpg', 'GaussBlur.jpg', 'rotated.jpg', 'inverted.jpg', 'hue_shift.jpg', 'contrast.jpg']
[(6.24, 1.21), (5.53, 1.47), (6.13, 1.32), (6.35, 1.49), (6.21, 1.27), (5.93, 1.37)]


From the results we can see that inverting and changing the hue 