### MATLAB mat files
The file `dmos.mat` contains two arrays of length 982 each: `dmos` and `orgs`. `orgs(i)==0` for distorted images.
Both arrays are built by concatenating the `dmos` (and `orgs`) variables of each database in the following order:

```dmos=[dmos_jpeg2000(1:227) dmos_jpeg(1:233) white_noise(1:174) gaussian_blur(1:174) fast_fading(1:174)]```
 
where `dmos_distortion(i)` is the dmos value for image `distortion/img_i.bmp`, and `distortion` is one of the five
listed above.
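
The concatenation order fixes where each distortion's block starts in the 982-element arrays. As a minimal sketch (the names `BLOCK_SIZES`, `OFFSETS`, and `locate` are hypothetical and not part of the dataset's tooling), a 1-based global index can be mapped back to its distortion and local image number:

```python
from itertools import accumulate

# Block sizes taken from the concatenation order above (they sum to 982).
BLOCK_SIZES = [
    ("jpeg2000", 227),
    ("jpeg", 233),
    ("white_noise", 174),
    ("gaussian_blur", 174),
    ("fast_fading", 174),
]

# Zero-based start offset of each block in the concatenated array.
OFFSETS = [0] + list(accumulate(size for _, size in BLOCK_SIZES))


def locate(i):
    """Map a 1-based global index (MATLAB-style) to (distortion, local 1-based index)."""
    if not 1 <= i <= OFFSETS[-1]:
        raise IndexError(f"index {i} is outside 1..{OFFSETS[-1]}")
    for (name, size), start in zip(BLOCK_SIZES, OFFSETS):
        if i <= start + size:
            return name, i - start
```

For example, `locate(228)` returns `("jpeg", 1)`, the first entry of the JPEG block.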

The dmos values whose corresponding `orgs==1` are zero (these are reference images). Note that an imperceptible
loss of quality does not necessarily mean a dmos value of zero, owing to the nature of the score processing used.
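
Because reference entries carry a dmos of zero, it can help to separate them from the distorted images before any analysis. A minimal sketch, using small invented arrays as stand-ins for the real `dmos` and `orgs` (the values are illustrative only, not taken from the dataset):

```python
import numpy

# Illustrative stand-ins for the arrays in dmos.mat; the numbers are invented.
dmos = numpy.array([0.0, 28.0, 34.0, 0.0, 47.4])
orgs = numpy.array([1, 0, 0, 1, 0])

# orgs == 0 marks distorted images; orgs == 1 marks reference images.
distorted = orgs == 0
distorted_dmos = dmos[distorted]

# Reference entries are expected to have a dmos of exactly zero.
assert numpy.all(dmos[~distorted] == 0)
```

With the real data, the same boolean mask could be applied to any per-image array of length 982.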

The file `refnames_all.mat` contains a cell array `refnames_all`. Entry `refnames_all{i}` is the name of
the reference image for image `i`, whose dmos value is given by `dmos(i)`. If `orgs(i)==0`, then this is a valid

In [339]:
import os
import re
from enum import unique, IntEnum

import numpy
import pandas
from scipy.io import loadmat


In [340]:
# Walk the dataset root and map each directory name to its .bmp files (as relative paths).
paths = list(os.walk('/home/rocampo/data/live'))
files = {
    os.path.basename(path): [
        os.path.join(os.path.basename(path), file)
        for file in file_names if file.endswith('.bmp')
    ]
    for path, _, file_names in paths
    if any(file.endswith('.bmp') for file in file_names)
}


In [341]:
refnames = loadmat('/home/rocampo/data/live/refnames_all.mat')
dmos = loadmat('/home/rocampo/data/live/dmos.mat')


In [342]:
# loadmat returns nested arrays; flatten both into 1-D arrays with one entry per image.
refnames = numpy.hstack(refnames['refnames_all'][0])
dmos = numpy.hstack(dmos['dmos'])


In [343]:
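def extract_index(file_name):
    # Use the last run of digits so that digits in the directory name
    # (e.g. the "2" in "jp2k/img10.bmp") are ignored.
    return int(re.findall(r'\d+', file_name)[-1])

def create_array(file_names):
    # Map each image number to its relative file path.
    return {extract_index(file_name): file_name for file_name in file_names}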

In [344]:
@unique
class Distortion(IntEnum):
    jpeg_2000 = 1
    jpeg = 2
    white_noise = 3
    gaussian_blur = 4
    fast_fading = 5

arrays = {
    Distortion.jpeg_2000: create_array(files['jp2k']),
    Distortion.jpeg: create_array(files['jpeg']),
    Distortion.white_noise: create_array(files['wn']),
    Distortion.gaussian_blur:  create_array(files['gblur']),
    Distortion.fast_fading:  create_array(files['fastfading']),  
}

In [345]:
dataframes = [
    pandas.DataFrame({
        'distortion': distortion, 
        'index': list(paths.keys()),
        'distorted_path': list(paths.values())
    })
    for distortion, paths in arrays.items() 
]

dataframe = pandas.concat(dataframes)

In [346]:
dataframe = dataframe.sort_values(by=['distortion', 'index'])

In [347]:
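# Rows are sorted by (distortion, index), matching the concatenation order in dmos.mat,
# so both arrays can be assigned positionally.
dataframe['dmos'] = dmos
dataframe['reference_path'] = refnames
dataframe.reference_path = 'refimgs/' + dataframe.reference_path
# Use the member name directly; str() on an IntEnum member returns the integer value on Python 3.11+.
dataframe.distortion = dataframe.distortion.apply(lambda x: Distortion(x).name)
dataframe = dataframe[['distortion', 'index', 'reference_path', 'distorted_path', 'dmos']]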

In [348]:
dataframe

| | distortion | index | reference_path | distorted_path | dmos |
|---|---|---|---|---|---|
| 18 | jpeg_2000 | 1 | refimgs/buildings.bmp | jp2k/img1.bmp | 0.000000 |
| 0 | jpeg_2000 | 2 | refimgs/studentsculpture.bmp | jp2k/img2.bmp | 28.003845 |
| 17 | jpeg_2000 | 3 | refimgs/rapids.bmp | jp2k/img3.bmp | 34.010736 |
| 35 | jpeg_2000 | 4 | refimgs/dancers.bmp | jp2k/img4.bmp | 65.131410 |
| 52 | jpeg_2000 | 5 | refimgs/churchandcapitol.bmp | jp2k/img5.bmp | 68.911340 |
| 70 | jpeg_2000 | 6 | refimgs/dancers.bmp | jp2k/img6.bmp | 65.150103 |
| 53 | jpeg_2000 | 7 | refimgs/churchandcapitol.bmp | jp2k/img7.bmp | 54.397266 |
| 161 | jpeg_2000 | 8 | refimgs/stream.bmp | jp2k/img8.bmp | 44.397145 |
| 184 | jpeg_2000 | 9 | refimgs/cemetry.bmp | jp2k/img9.bmp | 0.000000 |
| 71 | jpeg_2000 | 10 | refimgs/woman.bmp | jp2k/img10.bmp | 47.430014 |
