Use mean absolute deviation for pooling #89

igv · 2020-12-28T08:58:09Z

No description provided.

kornelski · 2020-12-29T21:51:19Z

That works great, but why does it have .powf(1/2^n)?

kornelski · 2020-12-30T00:46:55Z

That weight per scale score tweak hinted that perhaps the original scale weights were non-optimal. I've adjusted them, and that gave a decent improvement.

igv · 2020-12-30T03:42:17Z

but why does it have .powf(1/2^n)?

My tweak. Initially it was .powf(0.5) (MAD used in MDSI metric and they set it to 0.25 (EDIT: I forgot, it's a different parameter in MDSI)), but then I found out that when comparing only at first scale (SSIM), the best accuracy is with .powf(1) (almost on par with VIF, try it), so I tested it with each scale separately.
It's called mean absolute deviation, but you can also use median or mode or your own weighted mean.

I've adjusted them, and that gave a decent improvement.

I wouldn't change default scale weights only based on 1 dataset. Also Full accuracy on tid13 is so high with these weights because of Exotic group (which is pretty much irrelevant).

Finally, according to tid2013 paper, SSIM is among the best metrics for "Good quality" (Table 6), so maybe add -ssim switch to calculate SSIM instead of MS-SSIM.

igv · 2020-12-30T04:11:07Z

Afaik, this is how these scale weights calculated:

cpd represent cycles per degree of visual angle which is determined by the viewing resolution factor.

kornelski · 2020-12-30T17:37:12Z

Yes, indeed I'm worried about overfitting for TID2013. Can you recommend other similar datasets that I could use to verify accuracy?

igv · 2020-12-30T18:25:28Z

The most similar is KADID-10k. Only they made a mistake and placed reference images instead of first level distorted images for 3 distortion types (1st, 3rd and 8th).
The most popular are LIVE and CSIQ.
For some reason not very popular, but seems good - ESPL.

kornelski · 2020-12-30T21:58:55Z

With my max+avg pooling method my thinking was:

if there's a small but significant distortion in the image (e.g. a single bad pixel), then it actually affects a larger area than itself, because it's likely to be spoiling an entire object/area it is in. This is why there's a "max" pooling step.
But an image likely contains multiple objects, and one flaw is not disqualifying the whole image, which is why there's an averaging step too.

That solution is not mathematically elegant, but I like it, because it intuitively makes sense to me.

I've read how MAD works, but I'm not quite sure why this particular formula, especially with the "tweak", happens to work for pooling SSIM.

My hypothesis is:

For small n in pow(1/2^n) MAD is behaving like standard deviation, so it gives worse scores to images with multiple bad areas, but is insensitive to uniformly bad images.
For bigger n it makes avg approach 1, which makes MAD behave more like a simple average, which lets the other layers still catch uniformly bad images.

That makes me wonder whether this tweaked MAD just happens to be another way of mixing "worst" vs "average" error pooling, just not within each scale, but across scales.

And I wonder whether the proportion from pow(1/2^n) is ideal, or maybe it should be based on scale weights. This is something to experiment with.

igv · 2020-12-31T04:57:30Z

It doesn't behave like std dev (I edited my comment above about MAD in MDSI) and it always makes avg approach 1 (that's the idea) except at 1st scale. 1/pow(1/2^n) is just a short for [1, 0.5, 0.25, 0.125, 0.0625][n], it has no relation to scale weights and impact on accuracy and ssim scores is minimal (vs for example [1.0, 0.5, 0.5, 0.5, 0.5][n]).

At first, after arithmetic mean, I tried geometric and harmonic means. They made accuracy worse (harmonic was the worst). From wikipedia: For all positive data sets containing at least one pair of nonequal values, the harmonic mean is always the least of the three means, while the arithmetic mean is always the greatest of the three and the geometric mean is always in between. So I added that exponent to bias avg towards good quality scores. The bigger downsampling factor the higher and less spread per-pixel ssim scores are (except for some Exotic distortion types probably), so the exponent value must be lower to be effective.

That makes me wonder whether this tweaked MAD just happens to be another way of mixing "worst" vs "average" error pooling, just not within each scale, but across scales.

It's still within each scale, not across scales (cross scale pooling is in this block).

And I wonder whether the proportion from pow(1/2^n) is ideal

Maybe not ideal, but I tried with different constants (like 1.5, 1.75, 2.25, 2.5) and also with inverse square law formula (1/n^2).

igv · 2021-01-02T07:04:16Z

Tested on KADID-10k using scripts from here (only modified kadid10k.ROCC.py to show per type and per level accuracy).

1st is max+avg, 2nd is mad with non-custom scale weights.

Full SROCC:
-0.8525045482307988
-0.8562234657395806

Per type:

1:
-0.9482709952684674
-0.9524144401242203

2:
-0.888920125560901
-0.9036008877848952

3:
-0.9361601718850355
-0.942732720555493

4:
-0.9035452008410343
-0.9054443006074244

5:
-0.7505439463819739
-0.7721170099865738

6:
-0.877401725595138
-0.8779770694488326

7:
-0.6083002027988245
-0.5992010191419064

8:
-0.9202345152541264
-0.9192436523541861

9:
-0.9316652477020476
-0.9389697724114447

10:
-0.8725177456252896
-0.8922033461740347

11:
-0.8973088936601757
-0.8921109535136115

12:
-0.9312734837315821
-0.9300191542963452

13:
-0.8081662575202928
-0.8028804197128389

14:
-0.940544646168691
-0.9413759571962214

15:
-0.8984371485985785
-0.8985430184742703

16:
-0.9182612634610633
-0.9197175157873034

17:
-0.8913491941768678
-0.8932455426471694

18:
-0.7350325589908322
-0.7442556609995762

19:
-0.9281146136611702
-0.931734634444742

20:
-0.36244064860224723
-0.37055703484238806

21:
-0.4742315253111955
-0.48728508844891577

22:
-0.8585757413960478
-0.8672451350219674

23:
-0.5884619420813891
-0.5820224099228524

24:
-0.8857312844251738
-0.8900615092376575

25:
-0.75146916282513
-0.7677897943743471

Per level:

-0.7594226220260838
-0.7568424979839856

-0.7859239032568683
-0.7905589936285138

-0.7645387478162078
-0.7726706920108036

-0.6711213371339377
-0.6899222144730359

-0.598736695180533
-0.6224275615532653

igv · 2021-01-02T07:41:44Z

kadid10k.ROCC.py:

import pandas as pd

df = pd.read_csv('kadid10k.dssim1.csv')
df2 = pd.read_csv('kadid10k.dssim2.csv')

print('Full SROCC:')
print(df[['dmos', 'DSSIM']].corr('spearman').loc["DSSIM"][0])
print(df2[['dmos', 'DSSIM']].corr('spearman').loc["DSSIM"][0])

print('\nFull KROCC:')
print(df[['dmos', 'DSSIM']].corr('kendall').loc["DSSIM"][0])
print(df2[['dmos', 'DSSIM']].corr('kendall').loc["DSSIM"][0])

print("\nPer type:")
for i in range(25):
    a=df['dist_img'].str.split("_", expand=True)[:][1].astype(int).isin([i+1])
    print("{}:".format(i+1))
    print(df[['dmos', 'DSSIM']][a].corr('spearman').loc["DSSIM"][0])
    print(df2[['dmos', 'DSSIM']][a].corr('spearman').loc["DSSIM"][0])
    print("\n------------------")

print("\nPer level:")
for i in range(5):
    a=df['dist_img'].str.split("_", expand=True)[:][2].str.split(".", expand=True)[:][0].astype(int).isin([i+1])
    print(df[['dmos', 'DSSIM']][a].corr('spearman').loc["DSSIM"][0])
    print(df2[['dmos', 'DSSIM']][a].corr('spearman').loc["DSSIM"][0])
    print("\n------------------")

igv · 2021-01-06T06:49:09Z

Btw, there is my implementation of MDSI in Python. I removed it from my repo because I didn't like this metric very much. But maybe it's actually good?

kornelski · 2021-01-06T14:59:31Z

Thanks for the data. It looks good.

It makes DSSIM output scores that have a different magnitude than previously, so I'll probably add some rescaling fudge in order to avoid breaking users to much.

igv · 2021-01-11T16:11:55Z

Thinking 5x5 binomial / Gaussian (std=1) down-sampling filter might work better with MAD than 2x2 avg.

kornelski · 2021-01-14T00:56:37Z

I guess it could help a little. IIRC change of 2x box blur to proper Gaussian kernel in SSIM blurring helped a little too.

igv · 2021-01-26T08:14:38Z

Made a python version of this metric (grayscale only) for testing, with an alternative version of MAD.

import sys
from PIL import Image
import numpy as np
from scipy.ndimage import gaussian_filter

WEIGHTS = [0.0448]#, 0.2856, 0.3001, 0.2363, 0.1333]

def msssim(file1, file2):
    img1 = Image.open(file1).convert('RGB')
    img2 = Image.open(file2).convert('RGB')

    width, height = img1.size
    img1 = np.frombuffer(img1.tobytes(), dtype=np.uint8).reshape(height, width, 3) / 255
    img2 = np.frombuffer(img2.tobytes(), dtype=np.uint8).reshape(height, width, 3) / 255
    
    img1 = np.where(img1 > 0.04045, np.power((img1 + 0.055) / 1.055, 2.4),  img1 / 12.92)
    img2 = np.where(img2 > 0.04045, np.power((img2 + 0.055) / 1.055, 2.4),  img2 / 12.92)

    img1 = 0.2126 * img1[:,:,0] + 0.7152 * img1[:,:,1] + 0.0722 * img1[:,:,2]
    img2 = 0.2126 * img2[:,:,0] + 0.7152 * img2[:,:,1] + 0.0722 * img2[:,:,2]

    mssim = []
    for i in range(len(WEIGHTS)):
        mssim.append(ssim(pow(img1,1./2.2), pow(img2,1./2.2), i, i<len(WEIGHTS)-1))
        #img1 = avgpooling(img1, 2)
        #img2 = avgpooling(img2, 2)
        img1 = gaussian_filter(img1, 1.08, truncate=1.5)[::2,::2]
        img2 = gaussian_filter(img2, 1.08, truncate=1.5)[::2,::2]

    return np.sum(np.multiply(np.stack(mssim), WEIGHTS)) / np.sum(WEIGHTS)

def mad(x, l):
    return np.mean(np.absolute(x - np.power(np.mean(x), np.power(.5, l)))) # np.mean(np.absolute(x - np.mean(x if l==0 else np.sort(x, axis=None)[-int(x.size//1.5):])))

def ssim(L1, L2, lvl, cs_map):
    C1=(0.01)**2
    C2=(0.03)**2
    sd, t = 1.5, 3 #kernel radius = round(sd * truncate)

    mu1 = gaussian_filter(L1, sd, truncate=t)
    mu2 = gaussian_filter(L2, sd, truncate=t)
    mu1_sq = mu1 * mu1
    mu2_sq = mu2 * mu2
    mu1_mu2 = mu1 * mu2
    sigma1_sq = gaussian_filter(L1 * L1, sd, truncate=t) - mu1_sq
    sigma2_sq = gaussian_filter(L2 * L2, sd, truncate=t) - mu2_sq
    sigma12 = gaussian_filter(L1 * L2, sd, truncate=t) - mu1_mu2

    if cs_map:
        value = (2.0*sigma12 + C2)/(sigma1_sq + sigma2_sq + C2)
    else:
        value = ((2.0*mu1_mu2 + C1)*(2.0*sigma12 + C2))/((mu1_sq + mu2_sq + C1)*
                    (sigma1_sq + sigma2_sq + C2))

    return mad(value, lvl)

def avgpooling(mat, ksize):
    m, n = mat.shape[:2]
    ny = m // ksize
    nx = n // ksize
    mat_pad = mat[:ny*ksize, :nx*ksize, ...]
    new_shape = (ny, ksize, nx, ksize) + mat.shape[2:]
    result = np.nanmean(mat_pad.reshape(new_shape), axis=(1,3))
    return result

def main():
    for arg in sys.argv[2:]:
        score = msssim(sys.argv[1], arg)
        print(str(score) + "\t" + arg)

if __name__ == '__main__':
    main()

Use mean absolute deviation for pooling

a215523

kornelski merged commit 1a40e97 into kornelski:main Jan 29, 2021

igv mentioned this pull request Sep 25, 2023

builtin.conf values for gpu-hq mpv-player/mpv#12082

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Use mean absolute deviation for pooling #89

Use mean absolute deviation for pooling #89

igv commented Dec 28, 2020 •

edited

Loading

kornelski commented Dec 29, 2020

kornelski commented Dec 30, 2020

igv commented Dec 30, 2020 •

edited

Loading

igv commented Dec 30, 2020

kornelski commented Dec 30, 2020

igv commented Dec 30, 2020 •

edited

Loading

kornelski commented Dec 30, 2020

igv commented Dec 31, 2020 •

edited

Loading

igv commented Jan 2, 2021 •

edited

Loading

igv commented Jan 2, 2021

igv commented Jan 6, 2021 •

edited

Loading

kornelski commented Jan 6, 2021

igv commented Jan 11, 2021

kornelski commented Jan 14, 2021

igv commented Jan 26, 2021 •

edited

Loading

Use mean absolute deviation for pooling #89

Use mean absolute deviation for pooling #89

Conversation

igv commented Dec 28, 2020 • edited Loading

kornelski commented Dec 29, 2020

kornelski commented Dec 30, 2020

igv commented Dec 30, 2020 • edited Loading

igv commented Dec 30, 2020

kornelski commented Dec 30, 2020

igv commented Dec 30, 2020 • edited Loading

kornelski commented Dec 30, 2020

igv commented Dec 31, 2020 • edited Loading

igv commented Jan 2, 2021 • edited Loading

Per type:

Per level:

igv commented Jan 2, 2021

igv commented Jan 6, 2021 • edited Loading

kornelski commented Jan 6, 2021

igv commented Jan 11, 2021

kornelski commented Jan 14, 2021

igv commented Jan 26, 2021 • edited Loading

igv commented Dec 28, 2020 •

edited

Loading

igv commented Dec 30, 2020 •

edited

Loading

igv commented Dec 30, 2020 •

edited

Loading

igv commented Dec 31, 2020 •

edited

Loading

igv commented Jan 2, 2021 •

edited

Loading

igv commented Jan 6, 2021 •

edited

Loading

igv commented Jan 26, 2021 •

edited

Loading